RE: Feedback on MLlib roadmap process proposal

2017-01-24 Thread Ilya Matiach
Thanks Sean, this is a really helpful overview, and contains good guidance for new contributors to ML/MLLIB. My confusion was that the ML 2.2 roadmap critical features (https://issues.apache.org/jira/browse/SPARK-18813) did not line up with the top ML/MLLIB JIRAs by Votes

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Liang-Chi Hsieh
Congrats to Burak and Holden! Thanks for your work! Jeff Zhang wrote > Congratulations Burak and Holden! > > Yanbo Liang < > ybliang8@ > > wrote on Wed, Jan 25, 2017 at 11:54 AM: > >> Congratulations, Burak and Holden. >> >> On Tue, Jan 24, 2017 at 7:32 PM, Chester Chen < > chesterchen@ > > >> wrote: >> >

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Jeff Zhang
Congratulations Burak and Holden! Yanbo Liang wrote on Wed, Jan 25, 2017 at 11:54 AM: > Congratulations, Burak and Holden. > > On Tue, Jan 24, 2017 at 7:32 PM, Chester Chen > wrote: > > Congratulations to both. > > > > Holden, we need to catch up. > > > > > > *Chester Chen * > > ■ Senior Manager – Data Science &

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Yanbo Liang
Congratulations, Burak and Holden. On Tue, Jan 24, 2017 at 7:32 PM, Chester Chen wrote: > Congratulations to both. > > > > Holden, we need to catch up. > > > > > > *Chester Chen * > > ■ Senior Manager – Data Science & Engineering > > 3000 Clearview Way > > San Mateo, CA 94402 > > > > > > *From: *Fe

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Chester Chen
Congratulations to both. Holden, we need to catch up. Chester Chen ■ Senior Manager – Data Science & Engineering 3000 Clearview Way San Mateo, CA 94402 From: Felix Cheung Date: Tuesday, January 24, 2017 at 1:20 PM To: Reynold Xin, "dev@spark.apache.org" Cc

Re: MLlib mission and goals

2017-01-24 Thread Joseph Bradley
*Re: performance measurement framework* We (Databricks) used to use spark-perf, but that was mainly for the RDD-based API. We've now switched to spark-sql-perf, which does include some ML benchmarks despite t

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Deepak Sharma
Congratulations Holden & Burak On Wed, Jan 25, 2017 at 8:23 AM, jiangxingbo wrote: > Congratulations Burak & Holden! > > > On Jan 25, 2017, at 2:13 AM, Reynold Xin wrote: > > > > Hi all, > > > > Burak and Holden have recently been elected as Apache Spark committers. > > > > Burak has been very active in a

Re: welcoming Burak and Holden as committers

2017-01-24 Thread jiangxingbo
Congratulations Burak & Holden! > On Jan 25, 2017, at 2:13 AM, Reynold Xin wrote: > > Hi all, > > Burak and Holden have recently been elected as Apache Spark committers. > > Burak has been very active in a large number of areas in Spark, including > linear algebra, stats/maths functions in DataFrames, Py

Re: MLlib mission and goals

2017-01-24 Thread bradc
I believe one of the higher-level goals of Spark MLlib should be to improve the efficiency of the ML algorithms that already exist. Currently ML has reasonable coverage of the important core algorithms. The work to get to feature parity for DataFrame-based API and model persistence are a

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Nan Zhu
Congratulations! On Tue, Jan 24, 2017 at 4:50 PM, Hyukjin Kwon wrote: > Congratulations!! > > 2017-01-25 9:22 GMT+09:00 Takeshi Yamamuro : > >> Congrats! >> >> // maropu >> >> On Wed, Jan 25, 2017 at 9:20 AM, Kousuke Saruta < >> saru...@oss.nttdata.co.jp> wrote: >> >>> Congrats, Burak and Holden!

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Hyukjin Kwon
Congratulations!! 2017-01-25 9:22 GMT+09:00 Takeshi Yamamuro : > Congrats! > > // maropu > > On Wed, Jan 25, 2017 at 9:20 AM, Kousuke Saruta > wrote: > >> Congrats, Burak and Holden! >> >> - Kousuke >> >> On 2017/01/25 6:36, Herman van Hövell tot Westerflier wrote: >> >> Congrats! >> >> On Tue, Ja

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Takeshi Yamamuro
Congrats! // maropu On Wed, Jan 25, 2017 at 9:20 AM, Kousuke Saruta wrote: > Congrats, Burak and Holden! > > - Kousuke > > On 2017/01/25 6:36, Herman van Hövell tot Westerflier wrote: > > Congrats! > > On Tue, Jan 24, 2017 at 10:20 PM, Felix Cheung > wrote: > >> Congrats and welcome!! >> >> >>

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Kousuke Saruta
Congrats, Burak and Holden! - Kousuke On 2017/01/25 6:36, Herman van Hövell tot Westerflier wrote: Congrats! On Tue, Jan 24, 2017 at 10:20 PM, Felix Cheung wrote: Congrats and welcome!! ---

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Holden Karau
Also thanks everyone :) Looking forward to helping out (and if anyone wants to get started contributing to PySpark please ping me :)) On Tue, Jan 24, 2017 at 3:24 PM, Burak Yavuz wrote: > Thank you very much everyone! Hoping to help out the community as much as > I can! > > Best, > Burak > > On

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Burak Yavuz
Thank you very much everyone! Hoping to help out the community as much as I can! Best, Burak On Tue, Jan 24, 2017 at 2:29 PM, Jacek Laskowski wrote: > Wow! At long last. Congrats Burak and Holden! > > p.s. I was a bit worried that the process of accepting new committers > is as hard as pas

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Jacek Laskowski
Wow! At long last. Congrats Burak and Holden! p.s. I was a bit worried that the process of accepting new committers is as hard as passing Sean's sanity checks for PRs, but given this it seems so much easier :D Regards, Jacek Laskowski https://medium.com/@jaceklaskowski/ Masterin

Re: welcoming Burak and Holden as committers

2017-01-24 Thread zero323
Kudos!

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Herman van Hövell tot Westerflier
Congrats! On Tue, Jan 24, 2017 at 10:20 PM, Felix Cheung wrote: > Congrats and welcome!! > > > -- > *From:* Reynold Xin > *Sent:* Tuesday, January 24, 2017 10:13:16 AM > *To:* dev@spark.apache.org > *Cc:* Burak Yavuz; Holden Karau > *Subject:* welcoming Burak and Hol

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Felix Cheung
Congrats and welcome!! From: Reynold Xin Sent: Tuesday, January 24, 2017 10:13:16 AM To: dev@spark.apache.org Cc: Burak Yavuz; Holden Karau Subject: welcoming Burak and Holden as committers Hi all, Burak and Holden have recently been elected as Apache Spark com

Re: MLlib mission and goals

2017-01-24 Thread Saikat Kanjilal
In reading through this and thinking about usability: is there any interest in building a performance measurement framework around some (or maybe all) of the ML/Lib algorithms? I envision this as something that can be run for each release build for our end users. It may be useful for internal ml

Re: MLlib mission and goals

2017-01-24 Thread Asher Krim
On the topic of usability, I think more effort should be put into large scale testing. We've encountered issues with building large models that are not apparent in small models, and these issues have made productizing ML/MLLIB much more difficult than we first anticipated. Considering that one of t

Re: MLlib mission and goals

2017-01-24 Thread Miao Wang
I have been working on ML/MLLIB/R since last year. Here are some of my thoughts from a beginner's perspective: Current ML/MLLIB core algorithms can serve as good implementation examples, which makes adding new algorithms easier. Even a beginner like me can pick it up quickly and learn how to add n

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Suresh Thalamati
Congratulations Burak and Holden! -suresh > On Jan 24, 2017, at 10:13 AM, Reynold Xin wrote: > > Hi all, > > Burak and Holden have recently been elected as Apache Spark committers. > > Burak has been very active in a large number of areas in Spark, including > linear algebra, stats/maths fun

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Srabasti Banerjee
Congratulations Holden & Burak :-) Thanks, Srabasti On Tuesday, 24 January 2017 10:51 AM, shane knapp wrote: congrats to the both of you! :) On Tue, Jan 24, 2017 at 10:13 AM, Reynold Xin wrote: > Hi all, > > Burak and Holden have recently been elected as Apache Spark committers. > > B

Re: welcoming Burak and Holden as committers

2017-01-24 Thread shane knapp
congrats to the both of you! :) On Tue, Jan 24, 2017 at 10:13 AM, Reynold Xin wrote: > Hi all, > > Burak and Holden have recently been elected as Apache Spark committers. > > Burak has been very active in a large number of areas in Spark, including > linear algebra, stats/maths functions in Data

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Cody Koeninger
Congrats, glad to hear it On Jan 24, 2017 12:47 PM, "Shixiong(Ryan) Zhu" wrote: > Congrats Burak & Holden! > > On Tue, Jan 24, 2017 at 10:39 AM, Joseph Bradley > wrote: > >> Congratulations Burak & Holden! >> >> On Tue, Jan 24, 2017 at 10:33 AM, Dongjoon Hyun >> wrote: >> >>> Great! Congratula

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Shixiong(Ryan) Zhu
Congrats Burak & Holden! On Tue, Jan 24, 2017 at 10:39 AM, Joseph Bradley wrote: > Congratulations Burak & Holden! > > On Tue, Jan 24, 2017 at 10:33 AM, Dongjoon Hyun > wrote: > >> Great! Congratulations, Burak and Holden. >> >> Bests, >> Dongjoon. >> >> On 2017-01-24 10:29 (-0800), Nicholas Ch

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Denny Lee
Awesome! Congrats Burak & Holden!! On Tue, Jan 24, 2017 at 10:39 Joseph Bradley wrote: > Congratulations Burak & Holden! > > On Tue, Jan 24, 2017 at 10:33 AM, Dongjoon Hyun > wrote: > > Great! Congratulations, Burak and Holden. > > Bests, > Dongjoon. > > On 2017-01-24 10:29 (-0800), Nicholas Ch

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Shankar Venkataraman
Congrats Burak and Holden! On Tue, Jan 24, 2017 at 10:33 AM Dongjoon Hyun wrote: > Great! Congratulations, Burak and Holden. > > Bests, > Dongjoon. > > On 2017-01-24 10:29 (-0800), Nicholas Chammas > wrote: > > 👏 👍 > > > > Congratulations, Burak and Holden. > > > > On Tue, Jan 24, 2017 at 1:27 P

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Joseph Bradley
Congratulations Burak & Holden! On Tue, Jan 24, 2017 at 10:33 AM, Dongjoon Hyun wrote: > Great! Congratulations, Burak and Holden. > > Bests, > Dongjoon. > > On 2017-01-24 10:29 (-0800), Nicholas Chammas > wrote: > > 👏 👍 > > > > Congratulations, Burak and Holden. > > > > On Tue, Jan 24, 2017 at

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Dongjoon Hyun
Great! Congratulations, Burak and Holden. Bests, Dongjoon. On 2017-01-24 10:29 (-0800), Nicholas Chammas wrote: > 👏 👍 > > Congratulations, Burak and Holden. > > On Tue, Jan 24, 2017 at 1:27 PM Russell Spitzer > wrote: > > > Great news! Congratulations! > > > > On Tue, Jan 24, 2017 at

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Nicholas Chammas
👏 👍 Congratulations, Burak and Holden. On Tue, Jan 24, 2017 at 1:27 PM Russell Spitzer wrote: > Great news! Congratulations! > > On Tue, Jan 24, 2017 at 10:25 AM Dean Wampler > wrote: > > Congratulations to both of you! > > dean > > *Dean Wampler, Ph.D.* > Author: Programming Scala, 2nd Editio

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Russell Spitzer
Great news! Congratulations! On Tue, Jan 24, 2017 at 10:25 AM Dean Wampler wrote: > Congratulations to both of you! > > dean > > *Dean Wampler, Ph.D.* > Author: Programming Scala, 2nd Edition > , Fast Data > Architectures for Streaming Applicatio

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Dean Wampler
Congratulations to both of you! dean *Dean Wampler, Ph.D.* Author: Programming Scala, 2nd Edition, Fast Data Architectures for Streaming Applications, Funct

Re: [VOTE] Release Apache Parquet 1.8.2 RC1

2017-01-24 Thread Ryan Blue
Michael, the problem you're hitting is that Parquet's dependency moved to Avro 1.8.0 from 1.7.7 and added a method that is missing when you use 1.7.7. It looks like Spark is still using Avro 1.7.7. I think updating that to 1.8.x should fix the problem. rb On Tue, Jan 24, 2017 at 5:35 AM, Michael

Re: welcoming Burak and Holden as committers

2017-01-24 Thread Xiao Li
Congratulations! Burak and Holden! 2017-01-24 10:13 GMT-08:00 Reynold Xin : > Hi all, > > Burak and Holden have recently been elected as Apache Spark committers. > > Burak has been very active in a large number of areas in Spark, including > linear algebra, stats/maths functions in DataFrames, Py

welcoming Burak and Holden as committers

2017-01-24 Thread Reynold Xin
Hi all, Burak and Holden have recently been elected as Apache Spark committers. Burak has been very active in a large number of areas in Spark, including linear algebra, stats/maths functions in DataFrames, Python/R APIs for DataFrames, dstream, and most recently Structured Streaming. Holden has

Re: Feedback on MLlib roadmap process proposal

2017-01-24 Thread Cody Koeninger
Totally agree with most of what Sean said, just wanted to give an alternate take on the "maintainers" thing On Tue, Jan 24, 2017 at 10:23 AM, Sean Owen wrote: > There is no such list because there's no formal notion of ownership or > access to subsets of the project. Tracking an informal notion w

Re: Feedback on MLlib roadmap process proposal

2017-01-24 Thread Sean Owen
On Tue, Jan 24, 2017 at 3:58 PM Ilya Matiach wrote: > Just a few questions with regards to the MLLIB process: > > > >1. Is there a list of committers who can/are shepherds and what code >they own? I’ve seen this page: http://spark.apache.org/committers.html >but I’m not sure if it is

RE: Feedback on MLlib roadmap process proposal

2017-01-24 Thread Ilya Matiach
Just a few questions with regards to the MLLIB process: 1. Is there a list of committers who can/are shepherds and what code they own? I’ve seen this page: http://spark.apache.org/committers.html but I’m not sure if it is up to date and it doesn’t mention what code the committers own. It

[YARN] $ and $$ in prepareCommand to resolve environment in ExecutorRunnable?

2017-01-24 Thread Jacek Laskowski
Hi, Just noticed that [1] (and also [3]) is very cautious with $ and $$ to expand environment variables.

    javaOpts += "-Djava.io.tmpdir=" + new Path(
      YarnSparkHadoopUtil.expandEnvironment(Environment.PWD), // <-- here
      YarnConfiguration.DEFAULT_CONTAINER_TEMP_DIR
    )
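
For readers unfamiliar with the two marker styles in question, here is a rough sketch. As far as I understand Hadoop's `ApplicationConstants.Environment`, `$()` produces a platform-specific reference (`$PWD` on Unix, `%PWD%` on Windows), while `$$()` produces a cross-platform `{{PWD}}` placeholder that is substituted when the container is launched. The code below only mimics those string forms; `EnvMarkers` and its methods are hypothetical names for illustration and are not part of Hadoop or Spark.

```scala
// Hypothetical illustration only: mimics the string forms of the two
// marker styles. It does not call Hadoop or Spark.
object EnvMarkers {
  // Platform-specific form, expanded by the shell that runs the command.
  def platformSpecific(name: String, isWindows: Boolean): String =
    if (isWindows) "%" + name + "%" else "$" + name

  // Cross-platform placeholder, assumed to be replaced at container launch
  // rather than by the client-side shell.
  def crossPlatform(name: String): String =
    "{{" + name + "}}"
}
```

The point of the placeholder form is that the client building the launch command does not need to know which operating system the container will run on.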

[SPARK-16046] PR Review

2017-01-24 Thread Anton Okolnychyi
Hi all, there is a pull request that I would like to bring back to life. It is related to the SQL programming guide and can be found here. I believe the PR should be helpful. The initial review is done already. Also, I updated it recently and checked t

Re: [VOTE] Release Apache Parquet 1.8.2 RC1

2017-01-24 Thread Michael Heuer
Per comment https://github.com/bigdatagenomics/adam/pull/1360#issuecomment-274681650 and Jenkins failure https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1757/HADOOP_VERSION=2.6.0,SCALAVER=2.10,SPARK_VERSION=1.5.2,label=centos/ when bumping our build to 1.8.2-rc1 unit tests succeed but we enc

Re: MLlib mission and goals

2017-01-24 Thread Stephen Boesch
re: spark-packages.org and "Would these really be better in the core project?" That was not at all the intent of my input. Rather, it was to ask "how and where to structure/place deployment-quality code that is *not* part of the distribution?" Spark Packages has no curation whatsoever: no

Re: MLlib mission and goals

2017-01-24 Thread Jörn Franke
I also agree with Joseph and Sean. With respect to spark-packages, I think the issue is that you have to manually add it, although it basically fetches the package from Maven Central (or custom upload). From an organizational perspective there are other issues. E.g., you have to download it from

Re: MLlib mission and goals

2017-01-24 Thread Sean Owen
My $0.02, which shouldn't be weighted too much. I believe the mission as of Spark ML has been to provide the framework, and then implementation of 'the basics' only. It should have the tools that cover ~80% of use cases, out of the box, in a pretty well-supported and tested way. It's not a goal t