I think your observation is correct.
For example,
http://mvnrepository.com/artifact/org.apache.spark/spark-core_2.10/1.3.1
shows that it depends on hadoop-client
(http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client) from
Hadoop 2.2.
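
If you need the Hadoop 1 flavour from Maven Central anyway, one thing you
could try (a sketch only; untested, and 1.0.4 is just the version mentioned
below) is to exclude the transitive hadoop-client and pin your own:

    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.3.1</version>
      <exclusions>
        <!-- keep the Hadoop 2.2 client Spark pulls in off the classpath -->
        <exclusion>
          <groupId>org.apache.hadoop</groupId>
          <artifactId>hadoop-client</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <!-- pin the Hadoop 1 client explicitly -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>1.0.4</version>
    </dependency>

Bear in mind the published Spark classes were still compiled against Hadoop
2.2, so this may not be binary compatible at runtime; building from source
remains the safer route. You can check what actually lands on the classpath
with:

    mvn dependency:tree -Dincludes=org.apache.hadoop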

Cheers

On Tue, May 19, 2015 at 6:17 PM, Edward Sargisson <esa...@pobox.com> wrote:

> Hi,
> I'd like to confirm an observation I've just made: specifically, that
> Spark is only available in repo1.maven.org for one Hadoop variant.
>
> The Spark source can be compiled against a number of different Hadoop
> versions using Maven profiles. Yay.
> However, the Spark jars in repo1.maven.org appear to be compiled against
> one specific Hadoop version, and no other differentiation is made. (I can
> see a difference with hadoop-client being 2.2.0 in repo1.maven.org and
> 1.0.4 in the version I compiled locally.)
>
> The implication here is that if you have a pom file asking for
> spark-core_2.10 version 1.3.1 then Maven will only give you a Hadoop 2
> version. Maven assumes that non-snapshot artifacts never change, so trying
> to load a Hadoop 1 version will end in tears.
>
> This then means that if you compile code against spark-core then there
> will probably be NoClassDefFoundError issues on the classpath unless the
> Hadoop 2 version is exactly the one you want.
>
> Have I gotten this correct?
>
> It happens that our little app is using a Spark context directly from a
> Jetty webapp, and the classpath differences were/are causing some
> confusion. We are currently installing a Hadoop 1 Spark master and worker.
>
> Thanks a lot!
> Edward
>
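
Regarding the profile-based builds mentioned above: for reference, compiling
Spark 1.3.1 from source against a specific Hadoop looks roughly like this (a
sketch; the exact profiles are listed in the Building Spark docs, and the
version numbers are only examples):

    # Hadoop 1.x
    mvn -Dhadoop.version=1.0.4 -DskipTests clean package

    # Hadoop 2.2.x
    mvn -Phadoop-2.2 -Dhadoop.version=2.2.0 -DskipTests clean package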
