Re: real world spark code

2017-07-25 Thread Matei Zaharia

Re: real world spark code

2017-07-25 Thread Frank Austin Nothaft

Re: real world spark code

2017-07-25 Thread Jörn Franke
> Look for the ones that have unit and integration tests as well as a
> ci+reporting on code quality.
>
> All the others are just toy examples. Well should be :)
>
> On 25. Jul 2017, at 01

RE: real world spark code

2017-07-25 Thread Adaryl Wakefield
From: Jörn Franke
Sent: Tuesday, July 25, 2017 8:31 AM
To: Adaryl Wakefield
Cc: user@spark.apache.org
Subject: Re: real world spark code

Look for the ones that have unit and integration tests as well as a

Re: real world spark code

2017-07-25 Thread Jörn Franke
Look for the ones that have unit and integration tests, as well as CI and reporting on code quality. All the others are just toy examples. Well, they should be :)

> On 25. Jul 2017, at 01:08, Adaryl Wakefield wrote:
>
> Anybody know of publicly available GitHub repos of real world Spark
> applicati
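As a rough illustration of the screening advice above, here is a minimal Python sketch that checks a local checkout of a repository for test directories and CI configuration. The function name `repo_signals` and both marker lists are assumptions made up for this example; they reflect common conventions (sbt/Maven `src/test`, Travis, Jenkins, and so on), not an official or complete list. The sketch does not try to detect code-quality reporting.

```python
from pathlib import Path

# Hypothetical marker lists: common conventions only, not exhaustive.
TEST_MARKERS = ("src/test", "tests", "test")
CI_MARKERS = (
    ".travis.yml",
    "Jenkinsfile",
    ".circleci/config.yml",
    ".gitlab-ci.yml",
    ".github/workflows",
)


def repo_signals(repo_root):
    """Report whether a local checkout contains test directories and CI config."""
    root = Path(repo_root)
    # A test marker only counts if it is a directory in the checkout.
    has_tests = any((root / marker).is_dir() for marker in TEST_MARKERS)
    # CI markers may be files or directories, so existence is enough.
    has_ci = any((root / marker).exists() for marker in CI_MARKERS)
    return {"tests": has_tests, "ci": has_ci}
```

Calling `repo_signals("path/to/checkout")` returns a small dictionary with boolean `tests` and `ci` entries. A repository that passes both checks is only a candidate: whether the tests are meaningful still needs a manual look.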

Re: real world spark code

2017-07-25 Thread Xiayun Sun
Usually I look at the GitHub repos of big-name companies that I know are actively doing machine learning. For example, here are two Spark-related repos from SoundCloud:

- https://github.com/soundcloud/spark-pagerank
- https://github.com/soundcloud/cosine-lsh-join-spark

On 25 July 2017 at 06:08,