Re: [VOTE] Release Apache Spark 1.1.1 (RC2)

2014-11-23 Thread Stephen Haberman
Hi, I wanted to try 1.1.1-rc2 because we're running into SPARK-3633, but the "rc" releases not being tagged with "-rcX" means the pre-built artifacts are basically useless to me. (Pedantically, to test a release, I have to upload it into our internal repo, to compile jobs, start clusters, etc. Inv

Re: [VOTE] Release Apache Spark 1.1.1 (RC2)

2014-11-23 Thread Stephen Haberman
PARK-4568
>
> - Patrick
>
> On Sun, Nov 23, 2014 at 8:11 PM, Matei Zaharia wrote:
> > Interesting, perhaps we could publish each one with two IDs, of which the rc one is unofficial. The problem is indeed that you have to vote on a hash for a potentially final artifa

Re: [VOTE] Release Apache Spark 1.1.1 (RC2)

2014-11-23 Thread Stephen Haberman
> > http://maven.apache.org/plugins/maven-install-plugin/examples/specific-local-repo.html
Hm, I didn't know about that plugin--assuming it does all of the jar/pom/sources/etc., then, yes, that could work... At first glance, I'm not sure it'll bring over the pom with all of the transitive dep

Re: better compression codecs for shuffle blocks?

2014-07-14 Thread Stephen Haberman
Just a comment from the peanut gallery, but these buffers are a real PITA for us as well. Probably 75% of our non-user-error job failures are related to them. Just naively, what about not doing compression on the fly? E.g. during the shuffle just write straight to disk, uncompressed? For us, we

Re: small (yet major) change going in: broadcasting RDD to reduce task size

2014-07-16 Thread Stephen Haberman
Wow. Great writeup. I keep tabs on several open source projects that we use heavily, and I'd be ecstatic if more major changes were this well/succinctly explained instead of the usual "just read the commit message/diff". - Stephen