Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-28 Thread Sean Owen
On Fri, Aug 29, 2014 at 7:42 AM, Patrick Wendell wrote: > In terms of vendor support for this approach - In the early days > Cloudera asked us to add CDH4 repository and more recently Pivotal and > MapR also asked us to allow linking against their hadoop-client > libraries. So we've added these ba

RE: Working Formula for Hive 0.13?

2014-08-28 Thread Zhan Zhang
I have a preliminary patch against Spark 1.0.2, which is attached to SPARK-2706. Now I am working on supporting both hive-0.12 and hive-0.13.1 in a non-intrusive way (not breaking any existing hive-0.12 support when introducing support for the new version). I will attach a proposal to solve multi-version support iss

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-28 Thread Patrick Wendell
Yeah, we can't/won't post MapR binaries on the ASF web space for the release. However, I have been linking to them (at their request) with a clear identifier that it is an incompatible license and a 3rd party build. The only "vendor specific" build property we provide is compatibility with differe

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-28 Thread Matei Zaharia
Personally I'd actually consider putting CDH4 back if there are still users on it. It's always better to be inclusive, and the convenience of a one-click download is high. Do we have a sense on what % of CDH users still use CDH4? Matei On August 28, 2014 at 11:31:13 PM, Sean Owen (so...@clouder

Re: [Spark SQL] off-heap columnar store

2014-08-28 Thread Evan Chan
> >> The reason I'm asking about the columnar compressed format is that >> there are some problems for which Parquet is not practical. > > > Can you elaborate? Sure. - Organization or co has no Hadoop, but significant investment in some other NoSQL store. - Need to efficiently add a new column to

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-28 Thread Sean Owen
(Copying my reply since I don't know if it goes to the mailing list) Great, thanks for explaining the reasoning. You're saying these aren't going into the final release? I think that moots any issue surrounding distributing them then. This is all I know of from the ASF: https://community.apache.o

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-28 Thread Patrick Wendell
Hey Sean, The reason there are no longer CDH-specific builds is that all newer versions of CDH and HDP work with builds for the upstream Hadoop projects. I dropped CDH4 in favor of a newer Hadoop version (2.4) and the Hadoop-without-Hive (also 2.4) build. For MapR - we can't officially post thos

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-28 Thread Sean Owen
+1 I tested the source and Hadoop 2.4 release. Checksums and signatures are OK. Compiles fine with Java 8 on OS X. Tests... don't fail any more than usual. FWIW I've also been using the 1.1.0-SNAPSHOT for some time in another project and have encountered no problems. I notice that the 1.1.0 rele

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-28 Thread Cheng Lian
+1. Tested Spark SQL Thrift server and CLI against a single-node standalone cluster. On Thu, Aug 28, 2014 at 9:27 PM, Timothy Chen wrote: > +1 Make-distribution works, and also tested simple spark jobs on Spark > on Mesos on 8 node Mesos cluster. > > Tim > > On Thu, Aug 28, 2014 at 8:53 PM, Bura

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-28 Thread Timothy Chen
+1 Make-distribution works, and also tested simple Spark jobs on Spark on Mesos on an 8-node Mesos cluster. Tim On Thu, Aug 28, 2014 at 8:53 PM, Burak Yavuz wrote: > +1. Tested MLlib algorithms on Amazon EC2, algorithms show speed-ups between > 1.5-5x compared to the 1.0.2 release. > > - Origin

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-28 Thread Burak Yavuz
+1. Tested MLlib algorithms on Amazon EC2, algorithms show speed-ups between 1.5-5x compared to the 1.0.2 release. - Original Message - From: "Patrick Wendell" To: dev@spark.apache.org Sent: Thursday, August 28, 2014 8:32:11 PM Subject: Re: [VOTE] Release Apache Spark 1.1.0 (RC2) I'll k

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-28 Thread Patrick Wendell
I'll kick off the vote with a +1. On Thu, Aug 28, 2014 at 7:14 PM, Patrick Wendell wrote: > Please vote on releasing the following candidate as Apache Spark version > 1.1.0! > > The tag to be voted on is v1.1.0-rc2 (commit 711aebb3): > https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit

[VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-28 Thread Patrick Wendell
Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted on is v1.1.0-rc2 (commit 711aebb3): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=711aebb329ca28046396af1e34395a0df92b5327 The release files, including signatures, digests, etc. ca

Re: [VOTE] Release Apache Spark 1.1.0 (RC1)

2014-08-28 Thread Patrick Wendell
Okay I'm cancelling this vote in favor of RC2. On Thu, Aug 28, 2014 at 3:27 PM, Mridul Muralidharan wrote: > Thanks for being on top of this Patrick ! And apologies for not being able > to help more. > > Regards, > Mridul > > On Aug 29, 2014 1:30 AM, "Patrick Wendell" wrote: >> >> Mridul - thank

Re: deleted: sql/hive/src/test/resources/golden/case sensitivity on windows

2014-08-28 Thread HongQi
OK, I will create a PR to fix this. thanks for your comments. -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/deleted-sql-hive-src-test-resources-golden-case-sensitivity-on-windows-tp8085p8105.html Sent from the Apache Spark Developers List mailing lis

Re: Jira tickets for starter tasks

2014-08-28 Thread Josh Rosen
A JIRA admin needs to add you to the “Contributors” role group in order to allow you to assign issues to yourself. I’ve added this email address to that group, so you should be set! - Josh On August 28, 2014 at 3:52:57 PM, Bill Bejeck (bbej...@gmail.com) wrote: Hi, How do I get a starter

Re: Jira tickets for starter tasks

2014-08-28 Thread Cheng Lian
You can just start the work :) On Thu, Aug 28, 2014 at 3:52 PM, Bill Bejeck wrote: > Hi, > > How do I get a starter task jira ticket assigned to myself? Or do I just do > the work and issue a pull request with the associated jira number? > > Thanks, > Bill >

Jira tickets for starter tasks

2014-08-28 Thread Bill Bejeck
Hi, How do I get a starter task jira ticket assigned to myself? Or do I just do the work and issue a pull request with the associated jira number? Thanks, Bill

Re: [VOTE] Release Apache Spark 1.1.0 (RC1)

2014-08-28 Thread Mridul Muralidharan
Thanks for being on top of this Patrick ! And apologies for not being able to help more. Regards, Mridul On Aug 29, 2014 1:30 AM, "Patrick Wendell" wrote: > Mridul - thanks for sending this along and for the debugging comments > on the JIRA. I think we have a handle on the issue and we'll patch

"emergency" jenkins restart, aug 29th, 730am-9am PDT -- plus a postmortem

2014-08-28 Thread shane knapp
as with all software upgrades, sometimes things don't always work as expected. a recent change to stapler[1], to verbosely report NotExportableExceptions[2] is spamming our jenkins log file with stack traces, which is growing rather quickly (1.2G since 9am). this has been reported to the jenkins

Re: [VOTE] Release Apache Spark 1.1.0 (RC1)

2014-08-28 Thread Patrick Wendell
Mridul - thanks for sending this along and for the debugging comments on the JIRA. I think we have a handle on the issue and we'll patch it and spin a new RC. We can also update the test coverage to cover LZ4. - Patrick On Thu, Aug 28, 2014 at 9:27 AM, Mridul Muralidharan wrote: > Is SPARK-3277

New SparkR mailing list, JIRA

2014-08-28 Thread Shivaram Venkataraman
Hi I'd like to announce a couple of updates to the SparkR project. In order to facilitate better collaboration for new features and development we have a new mailing list, issue tracker for SparkR. - The new JIRA is hosted at https://sparkr.atlassian.net/browse/SPARKR/ and we have migrated all ex

Re: deleted: sql/hive/src/test/resources/golden/case sensitivity on windows

2014-08-28 Thread Josh Rosen
RE: building Spark on Windows: earlier this week, I tried running the Maven build on Windows 8 using the master branch and ran into a few issues.  I’ve opened a PR to fix them https://github.com/apache/spark/pull/2165. On August 28, 2014 at 1:47:58 AM, Cheng Lian (lian.cs@gmail.com) wrote: C

Re: jenkins maintenance/downtime, aug 28th, 730am-9am PDT

2014-08-28 Thread shane knapp
no problem! also, i retriggered: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19406 it's currently: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19411 On Thu, Aug 28, 2014 at 9:46 AM, Reynold Xin wrote: > Thanks for doing this, Shane. > > > On Thursd

Re: jenkins maintenance/downtime, aug 28th, 730am-9am PDT

2014-08-28 Thread Reynold Xin
Thanks for doing this, Shane. On Thursday, August 28, 2014, shane knapp wrote: > all clear: jenkins and all plugins have been updated! > > > On Thu, Aug 28, 2014 at 7:51 AM, shane knapp > wrote: > > > jenkins is upgraded, but a few jobs sneaked in before i could do the > > plugin updates. i'v

Re: [VOTE] Release Apache Spark 1.1.0 (RC1)

2014-08-28 Thread Mridul Muralidharan
Is SPARK-3277 applicable to 1.1 ? If yes, until it is fixed, I am -1 on the release (I am on break, so can't verify or help fix, sorry). Regards Mridul On 28-Aug-2014 9:33 pm, "Patrick Wendell" wrote: > Please vote on releasing the following candidate as Apache Spark version > 1.1.0! > > The ta

Re: jenkins maintenance/downtime, aug 28th, 730am-9am PDT

2014-08-28 Thread shane knapp
all clear: jenkins and all plugins have been updated! On Thu, Aug 28, 2014 at 7:51 AM, shane knapp wrote: > jenkins is upgraded, but a few jobs sneaked in before i could do the > plugin updates. i've put jenkins in quiet mode again, and once the spark > builds finish, i'll restart jenkins to

Re: jenkins maintenance/downtime, aug 28th, 730am-9am PDT

2014-08-28 Thread shane knapp
this one job is blocking the jenkins restart: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19406/ i'm about to kill it so that i can get this done. i'll restart the job after jenkins is back up. On Thu, Aug 28, 2014 at 7:51 AM, shane knapp wrote: > jenkins is upgraded, b

[VOTE] Release Apache Spark 1.1.0 (RC1)

2014-08-28 Thread Patrick Wendell
Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted on is v1.1.0-rc1 (commit f0718324): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=f07183249b74dd857069028bf7d570b35f265585 The release files, including signatures, digests, etc. ca

Re: jenkins maintenance/downtime, aug 28th, 730am-9am PDT

2014-08-28 Thread shane knapp
jenkins is upgraded, but a few jobs sneaked in before i could do the plugin updates. i've put jenkins in quiet mode again, and once the spark builds finish, i'll restart jenkins to enable the plugin updates and we'll be good to go. let's all take a moment to bask in the glory of the shiny new UI!

Re: jenkins maintenance/downtime, aug 28th, 730am-9am PDT

2014-08-28 Thread shane knapp
jenkins is now coming down. On Thu, Aug 28, 2014 at 7:19 AM, shane knapp wrote: > reminder: this is starting in 10 minutes > > > On Wed, Aug 27, 2014 at 4:13 PM, shane knapp wrote: > >> tomorrow morning i will be upgrading jenkins to the latest/greatest >> (1.577). >> >> at 730am, i will put

Re: jenkins maintenance/downtime, aug 28th, 730am-9am PDT

2014-08-28 Thread shane knapp
reminder: this is starting in 10 minutes On Wed, Aug 27, 2014 at 4:13 PM, shane knapp wrote: > tomorrow morning i will be upgrading jenkins to the latest/greatest > (1.577). > > at 730am, i will put jenkins in to a quiet period, so no new builds will > be accepted. once any running builds are

Re: deleted: sql/hive/src/test/resources/golden/case sensitivity on windows

2014-08-28 Thread Cheng Lian
Colon is not allowed to be part of a Windows file name and I think Git just cannot create this file while cloning. Remove the colon in the name string of this test case
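
A minimal sketch of the check described above, assuming the golden file name is derived directly from the test case's name string. The function names and the sample test name are hypothetical and are not taken from the Spark test suite; the forbidden-character set is the usual Windows one.

```python
# Characters Windows does not allow in file names. Control characters are
# also forbidden but are left out of this sketch for brevity.
WINDOWS_FORBIDDEN = set('<>:"/\\|?*')


def invalid_windows_chars(name):
    """Return the forbidden characters found in name, in order of first appearance."""
    found = []
    for ch in name:
        if ch in WINDOWS_FORBIDDEN and ch not in found:
            found.append(ch)
    return found


def safe_file_name(test_name, replacement="_"):
    """Replace each forbidden character so the name can be used as a file name on Windows."""
    return "".join(replacement if ch in WINDOWS_FORBIDDEN else ch for ch in test_name)


if __name__ == "__main__":
    # Hypothetical test-case name containing a colon.
    name = "case sensitivity: Hive table"
    print(invalid_windows_chars(name))
    print(safe_file_name(name))
```

Running a check like this over the test names before the golden files are generated would catch names that Git cannot materialize on a Windows checkout.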

deleted: sql/hive/src/test/resources/golden/case sensitivity on windows

2014-08-28 Thread 洪奇
Hi, I want to contribute some code to MLlib. I forked apache/spark to my own repository (chouqin/spark), and used `git clone https://github.com/chouqin/spark.git` to check out the code on my Windows system. In this directory, I ran `git status` before doing anything, and it output this: ``` On branch