Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-09-01 Thread Chester Chen
Thanks Sean, that makes it clear. On Tue, Sep 1, 2015 at 7:17 AM, Sean Owen wrote: > Any 1.5 RC comes from the latest state of the 1.5 branch at some point > in time. The next RC will be cut from whatever the latest commit is. > You can see the tags in git for the specific commits for each RC. >

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-09-01 Thread Sean Owen
Any 1.5 RC comes from the latest state of the 1.5 branch at some point in time. The next RC will be cut from whatever the latest commit is. You can see the tags in git for the specific commits for each RC. There's no such thing as "1.5.1 SNAPSHOT" commits, just commits to branch 1.5. I would ignore

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-09-01 Thread chester
Thanks for the explanation. Since 1.5.0 rc3 is not yet released, I assume it would be cut from the 1.5 branch; doesn't that bring in 1.5.1 snapshot code? The reason I am asking is that I would like to know: if I want to build 1.5.0 myself, which commit should I use? Sent from my iPad >

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-09-01 Thread Sean Owen
The head of branch 1.5 will always be a "1.5.x-SNAPSHOT" version. Yeah technically you would expect it to be 1.5.0-SNAPSHOT until 1.5.0 is released. In practice I think it's simpler to follow the defaults of the Maven release plugin, which will set this to 1.5.1-SNAPSHOT after any 1.5.0-rc is relea
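The version bump described in this message can be sketched in a few lines of Python. This is only an illustration of the default rule (after a release of X.Y.Z, the next development version is X.Y.(Z+1)-SNAPSHOT) and not the Maven release plugin's own code; the function name `next_snapshot_version` is hypothetical, and only plain dotted numeric versions are handled.

```python
def next_snapshot_version(release_version: str) -> str:
    """Return the development version that follows a release version.

    Illustration only: bump the last numeric component and append
    -SNAPSHOT, e.g. 1.5.0 -> 1.5.1-SNAPSHOT.
    """
    parts = release_version.split(".")
    if not all(p.isdigit() for p in parts):
        raise ValueError(
            f"expected a dotted numeric version, got {release_version!r}"
        )
    # Increment only the last component; earlier components are unchanged.
    parts[-1] = str(int(parts[-1]) + 1)
    return ".".join(parts) + "-SNAPSHOT"


if __name__ == "__main__":
    print(next_snapshot_version("1.5.0"))
```

Under this rule the branch head can already read 1.5.1-SNAPSHOT while 1.5.0 release candidates are still being cut, because the release version itself is chosen during the release process.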

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-09-01 Thread chester
Sorry, I still don't follow. I assume the release would be built from 1.5.0 before moving to 1.5.1. Are you saying 1.5.0 rc3 could be built from the 1.5.1 snapshot during release? Or would 1.5.0 rc3 be built from the last commit of 1.5.0 (before the change to the 1.5.1 snapshot)? Sent from my iPad > On

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-09-01 Thread Sean Owen
That's correct for the 1.5 branch, right? This doesn't mean that the next RC would have this value. You choose the release version during the release process. On Tue, Sep 1, 2015 at 2:40 AM, Chester Chen wrote: > Seems that Github branch-1.5 already changing the version to 1.5.1-SNAPSHOT, > > I a

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-31 Thread Chester Chen
It seems that the GitHub branch-1.5 has already changed the version to 1.5.1-SNAPSHOT. I am a bit confused: are we still on 1.5.0 RC3, or are we on 1.5.1? Chester On Mon, Aug 31, 2015 at 3:52 PM, Reynold Xin wrote: > I'm going to -1 the release myself since the issue @yhuai identified is > pretty serious

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-31 Thread Reynold Xin
I'm going to -1 the release myself since the issue @yhuai identified is pretty serious. It basically OOMs the driver for reading any files with a large number of partitions. Looks like the patch for that has already been merged. I'm going to cut rc3 momentarily. On Sun, Aug 30, 2015 at 11:30 AM,

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-30 Thread Sandy Ryza
+1 (non-binding) built from source and ran some jobs against YARN -Sandy On Sat, Aug 29, 2015 at 5:50 AM, vaquar khan wrote: > > +1 (1.5.0 RC2)Compiled on Windows with YARN. > > Regards, > Vaquar khan > +1 (non-binding, of course) > > 1. Compiled OSX 10.10 (Yosemite) OK Total time: 42:36 min >

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-29 Thread vaquar khan
+1 (1.5.0 RC2). Compiled on Windows with YARN. Regards, Vaquar khan +1 (non-binding, of course) 1. Compiled OSX 10.10 (Yosemite) OK Total time: 42:36 min mvn clean package -Pyarn -Phadoop-2.6 -DskipTests 2. Tested pyspark, mllib 2.1. statistics (min,max,mean,Pearson,Spearman) OK 2.2. Linear/Ri

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-28 Thread Yin Huai
-1. Found a problem with reading partitioned tables. Right now, we may create a SQL project/filter operator for every partition. When we have thousands of partitions, there will be a huge number of SQLMetrics (accumulators), which causes high memory pressure on the driver and then takes down the clust

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-28 Thread Jon Bender
Marcelo, Thanks for replying -- after looking at my test again, I misinterpreted another issue I'm seeing which is unrelated (note I'm not using a pre-built binary, rather had to build my own with Yarn/Hive support, as I want to use it on an older cluster (CDH5.1.0)). I can start up a pyspark app

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-28 Thread Shivaram Venkataraman
I've seen similar tar file warnings and in my case it was because I was using the default tar on a Macbook. Using gnu-tar from brew made the warnings go away. Thanks Shivaram On Fri, Aug 28, 2015 at 2:37 PM, Luciano Resende wrote: > The binary archives seems to be having some issues, which seems

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-28 Thread Luciano Resende
The binary archives seem to have some issues, which seem consistent across the few different ones (different versions of Hadoop) that I tried. tar -xvf spark-1.5.0-bin-hadoop2.6.tgz x spark-1.5.0-bin-hadoop2.6/lib/spark-examples-1.5.0-hadoop2.6.0.jar x spark-1.5.0-bin-hadoop2.6/lib/spark-a

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-28 Thread Marcelo Vanzin
Hi Jonathan, Can you be more specific about what problem you're running into? SPARK-6869 fixed the issue of pyspark vs. assembly jar by shipping the pyspark archives separately to YARN. With that fix in place, pyspark doesn't need to get anything from the Spark assembly, so it has no problems run

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-28 Thread Jonathan Bender
-1 for a regression in PySpark + YARN support. It seems like this JIRA https://issues.apache.org/jira/browse/SPARK-7733 added a requirement for Java 7 in the build process. Due to some quirks with the Java archive format changes between Java 6 and 7, using PySpark with a YARN uberjar seems to break

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-27 Thread Krishna Sankar
+1 (non-binding, of course) 1. Compiled OSX 10.10 (Yosemite) OK Total time: 42:36 min mvn clean package -Pyarn -Phadoop-2.6 -DskipTests 2. Tested pyspark, mllib 2.1. statistics (min,max,mean,Pearson,Spearman) OK 2.2. Linear/Ridge/Lasso Regression OK 2.3. Decision Tree, Naive Bayes OK 2.4. KMea

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-27 Thread Reynold Xin
Marcelo - please submit a patch anyway. If we don't include it in this release, it will go into 1.5.1. On Thu, Aug 27, 2015 at 4:56 PM, Marcelo Vanzin wrote: > On Thu, Aug 27, 2015 at 4:42 PM, Marcelo Vanzin > wrote: > > The Windows issue Sen raised could be considered a regression / > > blo

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-27 Thread Marcelo Vanzin
On Thu, Aug 27, 2015 at 4:42 PM, Marcelo Vanzin wrote: > The Windows issue Sen raised could be considered a regression / > blocker, though, and it's a one line fix. If we feel that's important, > let me know and I'll put up a PR against branch-1.5. Looks like Josh just found a blocker, so maybe w

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-27 Thread Marcelo Vanzin
+1. I tested the "without hadoop" binary package and ran our internal tests on it with dynamic allocation both on and off. The Windows issue Sen raised could be considered a regression / blocker, though, and it's a one line fix. If we feel that's important, let me know and I'll put up a PR against

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-27 Thread Sen Fang
Agree on the line fix. I'm submitting from Windows to YARN running on Linux. I imagine that this isn't that uncommon, especially for developers working in a corporate setting. On Thu, Aug 27, 2015 at 12:52 PM Marcelo Vanzin wrote: > Are you just submitting from Windows or are you also running YARN

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-27 Thread Marcelo Vanzin
Are you just submitting from Windows or are you also running YARN on Windows? If the former, I think the only fix that would be needed is this line (from that same patch): https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala#L434 I don't believ

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-27 Thread saurfang
Never mind. It looks like this has been fixed in https://github.com/apache/spark/pull/8053 but didn't make the cut? Even though the associated JIRA is targeted for 1.6, I was able to submit to YARN from Windows without a problem with 1.4. I'm wondering if this fix will be merged into the 1.5 branch. Let m

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-27 Thread saurfang
Compiled on Windows with YARN and HIVE. However, I got an exception when submitting an application to YARN, due to: java.net.URISyntaxException: Illegal character in opaque part at index 2: D:\TEMP\spark-b32c5b5b-a9fa-4cfd-a233-3977588d4092\__spark_conf__1960856096319316224.zip at java.net.URI$Pa
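The exception above is what happens when a raw Windows path is handed to a URI parser: the drive letter is read as a URI scheme and the backslash at index 2 is then illegal. The sketch below shows the same ambiguity with Python's standard library rather than `java.net.URI`, so it is an analogy and not the Spark code path. The path is a shortened, made-up example.

```python
from pathlib import PureWindowsPath
from urllib.parse import urlsplit

# Hypothetical, shortened Windows path (not the one from the report).
raw_path = r"D:\TEMP\spark-tmp\__spark_conf__.zip"

# Parsed directly as a URI, the drive letter "D" is taken for the scheme.
misparsed = urlsplit(raw_path)

# Converting the path to a file: URI first gives an unambiguous form.
file_uri = PureWindowsPath(raw_path).as_uri()

if __name__ == "__main__":
    print(misparsed.scheme)
    print(file_uri)
```

The general remedy is the same in any language: build the URI from the file path with a path-aware helper instead of passing the path string to a URI constructor.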

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-26 Thread Calvin Jia
+1, tested that 1.5.0-RC2 works with Tachyon 0.7.1 as external block store.

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-26 Thread Reynold Xin
One small update -- the vote should close Saturday Aug 29. Not Friday Aug 29. On Tue, Aug 25, 2015 at 9:28 PM, Reynold Xin wrote: > Please vote on releasing the following candidate as Apache Spark version > 1.5.0. The vote is open until Friday, Aug 29, 2015 at 5:00 UTC and passes > if a majorit

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-26 Thread Reynold Xin
The Scala 2.11 issue should be fixed, but doesn't need to be a blocker, since Maven builds fine. The sbt build is more aggressive to make sure we catch warnings. On Wed, Aug 26, 2015 at 10:01 AM, Sean Owen wrote: > My quick take: no blockers at this point, except for one potential > issue. Sti

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-26 Thread Sean Owen
My quick take: no blockers at this point, except for one potential issue. Still some 'critical' bugs worth a look. The release seems to pass tests but I get a lot of spurious failures; it took about 16 hours of running tests to get everything to pass at least once. Current score: 56 issues target

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-26 Thread Luc Bourlier
- tested the backpressure/rate controlling in streaming. It works as expected. - there is a problem with the Scala 2.11 sbt build: https://issues.apache.org/jira/browse/SPARK-10227 Luc Bourlier *Spark Team - Typesafe, Inc.* luc.bourl...@typesafe.com On We

Re: [VOTE] Release Apache Spark 1.5.0 (RC2)

2015-08-26 Thread rake
rxin wrote: > The release files, including signatures, digests, etc. can be found at: > http://people.apache.org/~pwendell/spark-releases/spark-1.5.0-rc2-bin/ > > Release artifacts are signed with the following key: > https://people.apache.org/keys/committer/pwendell.asc I was