[GitHub] spark pull request: [SPARK-1100] prevent Spark from overwriting di...

2014-02-26 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/11#discussion_r10114841 --- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala --- @@ -618,10 +619,6 @@ class PairRDDFunctions[K: ClassTag, V: ClassTag](self: RDD[

[GitHub] spark pull request: [SPARK-1100] prevent Spark from overwriting di...

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11#issuecomment-36218612 Merged build triggered. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] spark pull request: [SPARK-1100] prevent Spark from overwriting di...

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/11#issuecomment-36218602 Merged build triggered.

[GitHub] spark pull request: [SPARK-1100] prevent Spark from overwriting di...

2014-02-26 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/11#discussion_r10114812 --- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala --- @@ -77,7 +74,6 @@ class PairRDDFunctions[K: ClassTag, V: ClassTag](self: RDD[(K,

[GitHub] spark pull request: [SPARK-1100] prevent Spark from overwriting di...

2014-02-26 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/11#issuecomment-36218547 Jenkins, test this please.

[GitHub] spark pull request: Show Master status on UI page

2014-02-26 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/24#issuecomment-36218515 Thanks, merged into master. I also made sure enumerations print nicely because I wasn't sure they do... turns out they do.

[GitHub] spark pull request: SPARK-1125: When using a http proxy,the maven ...

2014-02-26 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/25#issuecomment-36218375 Perhaps no one encountered the same problem. Well, let me close this PR.

[GitHub] spark pull request: SPARK-1125: When using a http proxy,the maven ...

2014-02-26 Thread witgo
Github user witgo closed the pull request at: https://github.com/apache/spark/pull/25

[GitHub] spark pull request: [SPARK-1089] fix the regression problem on ADD...

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13#issuecomment-36218101 Merged build triggered.

[GitHub] spark pull request: [SPARK-1089] fix the regression problem on ADD...

2014-02-26 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/13#issuecomment-36218095 Thanks I've merged this into master and 0.9. Actually didn't notice tests hadn't gone through. We can revert if there are any issues. Jenkins, test this please.

[GitHub] spark pull request: [SPARK-1089] fix the regression problem on ADD...

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13#issuecomment-36218110 Merged build triggered.

[GitHub] spark pull request: [SPARK-1089] fix the regression problem on ADD...

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/13#issuecomment-36218102 Merged build started.

[GitHub] spark pull request: SPARK-1121 Only add avro if the build is for H...

2014-02-26 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/6#issuecomment-36217973 Merged !!

[GitHub] spark pull request: SPARK-1121 Only add avro if the build is for H...

2014-02-26 Thread ScrapCodes
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/6#issuecomment-36217654 Rebased !!

[GitHub] spark pull request: SPARK-1121 Only add avro if the build is for H...

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6#issuecomment-36217566 Merged build finished.

[GitHub] spark pull request: SPARK-1121 Only add avro if the build is for H...

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6#issuecomment-36217568 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12908/

[GitHub] spark pull request: SPARK-1125: When using a http proxy,the maven ...

2014-02-26 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/25#issuecomment-36217136 I'm still confused why you are posting this pull request. You found this was a problem with your local proxy. This change does not fix that at all. Nor would any change that

[GitHub] spark pull request: Show Master status on UI page

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/24#issuecomment-36217109 Merged build finished.

[GitHub] spark pull request: Remove references to ClusterScheduler (SPARK-1...

2014-02-26 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/9

[GitHub] spark pull request: Show Master status on UI page

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/24#issuecomment-36217111 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12907/

[GitHub] spark pull request: Updated more links in documentation

2014-02-26 Thread jyotiska
Github user jyotiska commented on the pull request: https://github.com/apache/spark/pull/23#issuecomment-36216611 Right, didn't notice that PR.

[GitHub] spark pull request: SPARK-1125: When using a http proxy,the maven ...

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/25#issuecomment-36216306 Can one of the admins verify this patch?

[GitHub] spark pull request: SPARK-1121 Only add avro if the build is for H...

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6#issuecomment-36216313 Merged build started.

[GitHub] spark pull request: SPARK-1121 Only add avro if the build is for H...

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6#issuecomment-36216310 Merged build triggered.

[GitHub] spark pull request: SPARK-1121 Only add avro if the build is for H...

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6#issuecomment-36216320 Merged build triggered.

[GitHub] spark pull request: SPARK-1125: When using a http proxy,the maven ...

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/25#issuecomment-36216317 Can one of the admins verify this patch?

[GitHub] spark pull request: SPARK-1125: When using a http proxy,the maven ...

2014-02-26 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/25 SPARK-1125: When using a http proxy,the maven build error for Spark Examples building with maven When using a http proxy, throw Failure to find org.eclipse.paho:mqtt-client:jar:0.4.0 in https://reposit

[GitHub] spark pull request: Updated more links in documentation

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/23#issuecomment-36216059 Merged build finished.

[GitHub] spark pull request: Updated more links in documentation

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/23#issuecomment-36216060 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12906/

[GitHub] spark pull request: SPARK-1121 Only add avro if the build is for H...

2014-02-26 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/6#issuecomment-36215951 Thanks @ScrapCodes looks good.

[GitHub] spark pull request: SPARK-1121 Only add avro if the build is for H...

2014-02-26 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/6#issuecomment-36215972 hmm... appears it does not merge cleanly

[GitHub] spark pull request: Updated more links in documentation

2014-02-26 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/23#issuecomment-36215928 I think this is redundant with #2

[GitHub] spark pull request: Show Master status on UI page

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/24#issuecomment-36215913 Merged build triggered.

[GitHub] spark pull request: Show Master status on UI page

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/24#issuecomment-36215905 Merged build triggered.

[GitHub] spark pull request: Show Master status on UI page

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/24#issuecomment-36215906 Merged build started.

[GitHub] spark pull request: Remove references to ClusterScheduler (SPARK-1...

2014-02-26 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/9#issuecomment-36215860 Thanks, merged into master.

[GitHub] spark pull request: Show Master status on UI page

2014-02-26 Thread colorant
GitHub user colorant opened a pull request: https://github.com/apache/spark/pull/24 Show Master status on UI page For standalone HA mode, A status is useful to identify the current master, already in json format too. You can merge this pull request into a Git repository by running:

[GitHub] spark pull request: Updated more links in documentation

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/23#issuecomment-36214553 Merged build triggered.

[GitHub] spark pull request: Updated more links in documentation

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/23#issuecomment-36214523 Merged build started.

[GitHub] spark pull request: Updated more links in documentation

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/23#issuecomment-36214522 Merged build triggered.

[GitHub] spark pull request: SPARK-1121 Only add avro if the build is for H...

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6#issuecomment-36214463 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12905/

[GitHub] spark pull request: SPARK-1121 Only add avro if the build is for H...

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6#issuecomment-36214461 Build finished.

[GitHub] spark pull request: Updated more links in documentation

2014-02-26 Thread jyotiska
GitHub user jyotiska opened a pull request: https://github.com/apache/spark/pull/23 Updated more links in documentation You can merge this pull request into a Git repository by running: $ git pull https://github.com/jyotiska/spark pyspark_docs2 Alternatively you can review an

[GitHub] spark pull request: Updated link for pyspark examples in docs

2014-02-26 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/22

Re: How to run a single test suite?

2014-02-26 Thread Bryn Keller
Thanks, that was it! On Wed, Feb 26, 2014 at 10:05 PM, Reynold Xin wrote: > You put your quotes in the wrong place. See > https://cwiki.apache.org/confluence/display/SPARK/Useful+Developer+Tools > > > > On Wed, Feb 26, 2014 at 10:04 PM, Bryn Keller wrote: > > > Hi Folks, > > > > I've tried usi

Re: How to run a single test suite?

2014-02-26 Thread Reynold Xin
You put your quotes in the wrong place. See https://cwiki.apache.org/confluence/display/SPARK/Useful+Developer+Tools On Wed, Feb 26, 2014 at 10:04 PM, Bryn Keller wrote: > Hi Folks, > > I've tried using "sbt test-only '*PairRDDFunctionsSuite'" to run only that > test suite, which is what I thi
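The quoting problem in this thread can be illustrated without running sbt at all. The sketch below uses a hypothetical helper, `count_args`, to show how many separate arguments the shell hands to a command. It assumes sbt's batch mode treats each command-line argument as its own command, so a bare `test-only` with no filter would run every suite, which matches the behaviour described in the question.

```shell
# Hypothetical helper: print how many separate arguments the shell passes.
count_args() { echo "$#"; }

# Form from the question: the shell splits this into two arguments,
# "test-only" and "*PairRDDFunctionsSuite".
unquoted=$(count_args test-only '*PairRDDFunctionsSuite')

# Quoting the whole command keeps the task and its filter together
# as a single argument.
quoted=$(count_args "test-only *PairRDDFunctionsSuite")

echo "unquoted form: $unquoted argument(s)"
echo "quoted form:   $quoted argument(s)"
```

Under that assumption, the form that keeps the filter attached to the task would be `sbt "test-only *PairRDDFunctionsSuite"`; the linked wiki page remains the authoritative reference.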

How to run a single test suite?

2014-02-26 Thread Bryn Keller
Hi Folks, I've tried using "sbt test-only '*PairRDDFunctionsSuite'" to run only that test suite, which is what I think is supposed to work with ScalaTest. I have also tried the variant with the fully qualified name spelled out as well. No matter what I try, it always runs *all* the test suites, wh

[GitHub] spark pull request: SPARK-1121 Only add avro if the build is for H...

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6#issuecomment-36212897 Build started.

[GitHub] spark pull request: SPARK-1121 Only add avro if the build is for H...

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6#issuecomment-36212896 Build triggered.

[GitHub] spark pull request: SPARK-1121 Only add avro if the build is for H...

2014-02-26 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/6#issuecomment-36212905 Build triggered.

Re: [DISCUSS] Necessity of Maven *and* SBT Build in Spark

2014-02-26 Thread Mridul Muralidharan
On Feb 26, 2014 11:12 PM, "Patrick Wendell" wrote: > > @mridul - As far as I know both Maven and Sbt use fairly similar > processes for building the assembly/uber jar. We actually used to > package spark with sbt and there were no specific issues we > encountered and AFAIK sbt respects versioning

Re: [IMPORTANT] Github/jenkins migration

2014-02-26 Thread Koert Kuipers
Thanks On Feb 26, 2014 7:24 PM, "Patrick Wendell" wrote: > You need to fork the new apache repository. > > 1. Fork https://github.com/apache/spark/ in github > 2. Add your own fork as a remote in your local git > ===> git remote add apache-pwendell g...@github.com:pwendell/spark.git > 3. Push you

Re: [IMPORTANT] Github/jenkins migration

2014-02-26 Thread Patrick Wendell
You need to fork the new apache repository. 1. Fork https://github.com/apache/spark/ in github 2. Add your own fork as a remote in your local git ===> git remote add apache-pwendell g...@github.com:pwendell/spark.git 3. Push your local branch to the fork on github. 4. Make a pull request from your fo
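Steps 2 and 3 above can be rehearsed locally before touching GitHub. This is a minimal sketch, not the project's prescribed procedure: a bare repository in a temporary directory stands in for your fork, and the remote name `my-fork`, the branch name `my-feature`, and the commit identity are all hypothetical placeholders.

```shell
set -eu

# A throwaway bare repository stands in for your fork on GitHub (assumption).
work=$(mktemp -d)
git init --quiet --bare "$work/fork.git"

# A local clone-like repository with one commit on a feature branch.
git init --quiet "$work/local"
cd "$work/local"
git -c user.name=example -c user.email=example@example.invalid \
    commit --quiet --allow-empty -m "example change"
git checkout --quiet -b my-feature

# Step 2: add your own fork as a remote (hypothetical remote name).
git remote add my-fork "$work/fork.git"

# Step 3: push the local branch to the fork.
git push --quiet my-fork my-feature

# Confirm the branch now exists on the fork.
pushed=$(git ls-remote --heads my-fork my-feature | wc -l | tr -d ' ')
echo "branches pushed: $pushed"
```

With a real fork, the path passed to `git remote add` would be the fork's GitHub URL instead of the local directory; the pull request in step 4 is then opened from the GitHub web interface.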

Re: [IMPORTANT] Github/jenkins migration

2014-02-26 Thread Koert Kuipers
github is not aware of the new repo being a "base-fork", so it's not easy to re-point pull requests. i am guessing it didn't get cloned from the incubator spark one? On Wed, Feb 26, 2014 at 5:56 PM, Patrick Wendell wrote: > Sorry if this wasn't clear - If you are in the middle of a review > close

Re: [IMPORTANT] Github/jenkins migration

2014-02-26 Thread Patrick Wendell
Sorry if this wasn't clear - If you are in the middle of a review, close it and re-open it against [1]. The reason is we can't test your changes against incubator-spark because it no longer exists. [1] https://github.com/apache/spark - Patrick On Wed, Feb 26, 2014 at 2:45 PM, Nan Zhu wrote: >

Re: [IMPORTANT] Github/jenkins migration

2014-02-26 Thread Nan Zhu
Hi, Patrick, How should we deal with the active pull requests in the old repository? Do the contributors have to do something? Best, -- Nan Zhu On Wednesday, February 26, 2014 at 5:37 PM, Patrick Wendell wrote: Hey All, The github incubator-spark mirror has been migrated to [1] by Apache infra and w

[IMPORTANT] Github/jenkins migration

2014-02-26 Thread Patrick Wendell
Hey All, The github incubator-spark mirror has been migrated to [1] by Apache infra and we've migrated Jenkins to reflect the new changes. This means the existing "incubator-spark" mirror is becoming outdated and no longer correctly displays pull request diffs. We've asked apache infra to see if

Re: [DISCUSS] Necessity of Maven *and* SBT Build in Spark

2014-02-26 Thread Nathan Kronenfeld
On Wed, Feb 26, 2014 at 2:11 PM, Sean Owen wrote: > I also favor Maven. I don't think the logic is "because it's common". As > Sandy says, it's because of the things that brings: more plugins, > easier to consume by more developers, etc. These are, however, just > some reasons 'for', and have to be

Re: [DISCUSS] Necessity of Maven *and* SBT Build in Spark

2014-02-26 Thread Koert Kuipers
yes. the Build.scala file behaves like a configuration file mostly, but because it is scala you can use the full power of a real language when needed. also i found writing sbt plugins doable (but not easy). On Feb 26, 2014 2:12 PM, "Sean Owen" wrote: > I also favor Maven. I don't think the logic

Re: [DISCUSS] Necessity of Maven *and* SBT Build in Spark

2014-02-26 Thread Evan Chan
Can't Maven POMs include other ones? So what if we remove the artifact specs from the main pom, have them generated by sbt make-pom, and include the generated file in the main pom.xml? I guess, just trying to figure out how much this would help (it seems at least it would remove the issue of m

Re: [DISCUSS] Necessity of Maven *and* SBT Build in Spark

2014-02-26 Thread Mark Hamstra
Yes, but the POM generated in that fashion is only sufficient for linking with Spark, not for building Spark or serving as a basis from which to build a customized Spark with Maven. So, starting from SparkBuild.scala and generating a POM with make-pom, those who wish to build a customized Spark wi

Re: [DISCUSS] Necessity of Maven *and* SBT Build in Spark

2014-02-26 Thread Evan Chan
Mark, No, I haven't tried this myself yet :-p Also I would expect that sbt-pom-reader does not do assemblies at all because that is an SBT plugin, so we would still need code to include sbt-assembly. There is also the trick question of how to include the assembly stuff into sbt-pom-reader

Re: [DISCUSS] Necessity of Maven *and* SBT Build in Spark

2014-02-26 Thread Sean Owen
I also favor Maven. I don't think the logic is "because it's common". As Sandy says, it's because of the things that brings: more plugins, easier to consume by more developers, etc. These are, however, just some reasons 'for', and have to be considered against the other pros and cons. The choice of

Re: [DISCUSS] Necessity of Maven *and* SBT Build in Spark

2014-02-26 Thread Koert Kuipers
i don't buy the argument that we should use it because it's the most common. if all we would do is use what is most common then we should switch to java, svn and maven On Wed, Feb 26, 2014 at 1:38 PM, Mark Grover wrote: > Hi Patrick, > And, to pile on what Sandy said. In my opinion, it's definit

Re: [DISCUSS] Necessity of Maven *and* SBT Build in Spark

2014-02-26 Thread Mark Hamstra
Evan, Have you actually tried to build Spark using its POM file and sbt-pom-reader? I just made a first, naive attempt, and I'm still sorting through just what this did and didn't produce. It looks like the basic jar files are at least very close to correct, and may be just fine, but that buildi

Re: [DISCUSS] Necessity of Maven *and* SBT Build in Spark

2014-02-26 Thread Mark Grover
Hi Patrick, And, to pile on what Sandy said. In my opinion, it's definitely more than just a matter of convenience. My comment below applies both to distribution builders and to people who have their own internal "distributions" (a few examples of which we have already seen on this thread

Re: [DISCUSS] Necessity of Maven *and* SBT Build in Spark

2014-02-26 Thread Sandy Ryza
@patrick - It seems like my point about being able to inherit the root pom was addressed and there's a way to handle this. The larger point I meant to make is that Maven is by far the most common build tool in projects that are likely to share contributors with Spark. I personally know 10 people

Re: [DISCUSS] Necessity of Maven *and* SBT Build in Spark

2014-02-26 Thread Patrick Wendell
@mridul - As far as I know both Maven and Sbt use fairly similar processes for building the assembly/uber jar. We actually used to package spark with sbt and there were no specific issues we encountered and AFAIK sbt respects versioning of transitive dependencies correctly. Do you have a specific b

Re: [DISCUSS] Necessity of Maven *and* SBT Build in Spark

2014-02-26 Thread Evan Chan
I'd like to propose the following way to move forward, based on the comments I've seen: 1. Aggressively clean up the giant dependency graph. One ticket I might work on if I have time is SPARK-681 which might remove the giant fastutil dependency (~15MB by itself). 2. Take an intermediate step

Re: [DISCUSS] Necessity of Maven *and* SBT Build in Spark

2014-02-26 Thread Koert Kuipers
We maintain an in-house Spark build using sbt. We have no problem using sbt assembly. We did add a few exclude statements for transitive dependencies. The main enemies of assemblies are jars that include stuff they shouldn't (kryo comes to mind, I think they include logback?), new versions of jars that

Discussion on SPARK-1139

2014-02-26 Thread Nan Zhu
Hi all, I just created a JIRA: https://spark-project.atlassian.net/browse/SPARK-1139. The issue is that the Spark APIs based on the new Hadoop API are actually a mixture of the old and new Hadoop APIs. The Spark APIs still use JobConf (or Configuration) as one of the parameters, but actually

Re: [DISCUSS] Necessity of Maven *and* SBT Build in Spark

2014-02-26 Thread Sean Owen
Side point -- "provided" scope is not the same as an exclude. "provided" means this artifact is used directly by this code (compile time), but it is not necessary to package it, since it will be available from a runtime container. Exclusions make an artifact, that would otherwise be available, una