Re: [VOTE] Release Apache Spark 1.1.1 (RC2)

2014-11-23 Thread Stephen Haberman
> > http://maven.apache.org/plugins/maven-install-plugin/examples/specific-local-repo.html Hm, I didn't know about that plugin--assuming it does all of the jar/pom/sources/etc., then, yes, that could work... At first glance, I'm not sure it'll bring over the pom with all of the transitive dep

Re: [VOTE] Release Apache Spark 1.1.1 (RC2)

2014-11-23 Thread Sean Owen
Stephen, you can publish the artifact to your repo under a different name, right? IIRC Maven will take care of the pom change along the way. Yes, you would never want to mess with changing an artifact after it's published. http://maven.apache.org/plugins/maven-install-plugin/examples/specific-loc

Re: [VOTE] Release Apache Spark 1.1.1 (RC2)

2014-11-23 Thread Stephen Haberman
Awesome, sounds great, guys; thanks for understanding. Depending on how badly I need 1.1.1-rc2 (I'll check my jobs tomorrow) I'll just build a local version for now. Should be easy, it's just been a while. :-) Thanks, Stephen On Sun Nov 23 2014 at 11:01:09 PM Patrick Wendell wrote: > Hey Steph

2 spark streaming questions

2014-11-23 Thread tian zhang
Hi, Dear Spark Streaming Developers and Users, We are prototyping using Spark Streaming and hit the following 2 issues on which I would like to seek your expertise. 1) We have a Spark Streaming application in Scala that reads data from Kafka into a DStream, does some processing and output a transfor

Re: Notes on writing complex spark applications

2014-11-23 Thread Patrick Wendell
Hey Evan, It might be nice to merge this into existing documentation. In particular, a lot of this could serve to update the current tuning section and programming guides. It could also work to paste this wholesale as a reference for Spark users, but in that case it's less likely to get updated w

Re: [VOTE] Release Apache Spark 1.1.1 (RC2)

2014-11-23 Thread Patrick Wendell
Hey Stephen, Thanks for bringing this up. Technically when we call a release vote it needs to be on the exact commit that will be the final release. However, one thing I've thought of doing for a while would be to publish the maven artifacts using a version tag with $VERSION-rcX even if the underl

Re: Notes on writing complex spark applications

2014-11-23 Thread Inkyu Lee
Very helpful!! thank you very much! 2014-11-24 2:17 GMT+09:00 Sam Bessalah : > Thanks Evan, this is great. > On Nov 23, 2014 5:58 PM, "Evan R. Sparks" wrote: > > > Hi all, > > > > Shivaram Venkataraman, Joseph Gonzalez, Tomer Kaftan, and I have been > > working on a short document about writing

Re: [VOTE] Release Apache Spark 1.1.1 (RC2)

2014-11-23 Thread Matei Zaharia
Interesting, perhaps we could publish each one with two IDs, of which the rc one is unofficial. The problem is indeed that you have to vote on a hash for a potentially final artifact. Matei > On Nov 23, 2014, at 7:54 PM, Stephen Haberman > wrote: > > Hi, > > I wanted to try 1.1.1-rc2 becaus

Re: [VOTE] Release Apache Spark 1.1.1 (RC2)

2014-11-23 Thread Stephen Haberman
Hi, I wanted to try 1.1.1-rc2 because we're running into SPARK-3633, but the "rc" releases not being tagged with "-rcX" means the pre-built artifacts are basically useless to me. (Pedantically, to test a release, I have to upload it into our internal repo, to compile jobs, start clusters, etc. Inv

Re: [VOTE] Release Apache Spark 1.1.1 (RC2)

2014-11-23 Thread Patrick Wendell
+1 (binding). Don't see any evidence of regressions at this point. The issue reported by Hector was not related to this release. On Sun, Nov 23, 2014 at 9:50 AM, Debasish Das wrote: > -1 from me...same FetchFailed issue as what Hector saw... > > I am running Netflix dataset and dumping out recomm

Re: [VOTE] Release Apache Spark 1.1.1 (RC2)

2014-11-23 Thread Debasish Das
-1 from me...same FetchFailed issue as what Hector saw... I am running the Netflix dataset and dumping out recommendations for all users. It shuffles around 100 GB of data on disk to run a reduceByKey per user on utils.BoundedPriorityQueue... The code runs fine with the MovieLens1m dataset... I gave Spark 10

Re: Notes on writing complex spark applications

2014-11-23 Thread Sam Bessalah
Thanks Evan, this is great. On Nov 23, 2014 5:58 PM, "Evan R. Sparks" wrote: > Hi all, > > Shivaram Venkataraman, Joseph Gonzalez, Tomer Kaftan, and I have been > working on a short document about writing high performance Spark > applications based on our experience developing MLlib, GraphX, ml-m

Re: Notes on writing complex spark applications

2014-11-23 Thread andy petrella
Cool! On Sun Nov 23 2014 at 5:58:03 PM Evan R. Sparks wrote: > Hi all, > > Shivaram Venkataraman, Joseph Gonzalez, Tomer Kaftan, and I have been > working on a short document about writing high performance Spark > applications based on our experience developing MLlib, GraphX, ml-matrix, > pipeli

Notes on writing complex spark applications

2014-11-23 Thread Evan R. Sparks
Hi all, Shivaram Venkataraman, Joseph Gonzalez, Tomer Kaftan, and I have been working on a short document about writing high performance Spark applications based on our experience developing MLlib, GraphX, ml-matrix, pipelines, etc. It may be a useful document both for users and new Spark develope