Hi Robert,

I’m not sure about sbt; we’re currently using Maven to build. We do create a single jar though, via the Maven shade plugin. Our project has three components, and we routinely distribute the jar for our project’s CLI out across a cluster. If you’re interested, here are our project’s master pom and the pom for our CLI.

There are a few dependencies that we exclude from hadoop-client:
• asm/asm
• org.jboss.netty/netty
• org.codehaus.jackson/*
• org.sonatype.sisu.inject/*

We’ve built and run this successfully across both Hadoop 1.0.4 and 2.2.0-2.2.5. (A rough sketch of the relevant pom fragments is at the bottom of this message.)

Regards,

Frank Austin Nothaft
fnoth...@berkeley.edu
fnoth...@eecs.berkeley.edu
202-340-0466

On Jun 29, 2014, at 4:20 PM, Robert James <srobertja...@gmail.com> wrote:

> On 6/29/14, FRANK AUSTIN NOTHAFT <fnoth...@berkeley.edu> wrote:
>> Robert,
>>
>> You can build a Spark application using Maven for Hadoop 2 by adding a
>> dependency on the Hadoop 2.* hadoop-client package. If you define any
>> Hadoop Input/Output formats, you may also need to depend on the
>> hadoop-mapreduce package.
>
> Thank you, Frank. Is it possible to do sbt-assembly after that? I get
> conflicts, because Spark's Maven dependency requires Hadoop 1. I've
> tried excluding that via sbt, but I still get conflicts within Hadoop 2,
> with different components requiring different versions of other jars.
>
> Is it possible to make a jar assembly using your approach? How? If not,
> how do you distribute the jars to the workers?
>
>> On Sun, Jun 29, 2014 at 12:20 PM, Robert James <srobertja...@gmail.com>
>> wrote:
>>
>>> Although Spark's home page offers binaries for Spark 1.0.0 with
>>> Hadoop 2, the Maven repository only seems to have one version, which
>>> uses Hadoop 1.
>>>
>>> Is it possible to use a Maven artifact with Hadoop 2? What is the id?
>>>
>>> If not, how can I use the prebuilt binaries to use Hadoop 2? Do I just
>>> copy the lib/ dir into my classpath?
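PS: to make the exclusions above concrete, here is a rough sketch of the pom fragments involved. This is not our exact pom: the Hadoop and plugin versions and the main class below are placeholders, and wildcard exclusions (artifactId *) need Maven 3.2.1 or newer; with an older Maven you'd list each excluded artifact explicitly.

<!-- hadoop-client dependency with the exclusions listed above -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>2.2.0</version> <!-- placeholder; use your Hadoop version -->
  <exclusions>
    <exclusion>
      <groupId>asm</groupId>
      <artifactId>asm</artifactId>
    </exclusion>
    <exclusion>
      <groupId>org.jboss.netty</groupId>
      <artifactId>netty</artifactId>
    </exclusion>
    <exclusion>
      <groupId>org.codehaus.jackson</groupId>
      <artifactId>*</artifactId>
    </exclusion>
    <exclusion>
      <groupId>org.sonatype.sisu.inject</groupId>
      <artifactId>*</artifactId>
    </exclusion>
  </exclusions>
</dependency>

<!-- shade plugin bound to the package phase to build the single jar -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.2</version> <!-- placeholder version -->
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <transformers>
          <!-- hypothetical entry point; replace with your CLI's main class -->
          <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
            <mainClass>com.example.cli.Main</mainClass>
          </transformer>
        </transformers>
      </configuration>
    </execution>
  </executions>
</plugin>

With something like this in place, mvn package writes a single shaded jar under target/, and that's the file we copy out to the cluster nodes.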