Patching Hadoop's build will fix this long term, but not until Hadoop 2.7.2.
I think just adding the openstack JAR to the Spark classpath should be enough to pick this up, which the --jars option can do with ease.

On that topic, one thing I would like to see (knowing what it takes to get Azure and S3A support in) would be for the Spark shell scripts to auto-add everything in SPARK_HOME/lib to the classpath (or at least to have the option to do so). Because spark-class doesn't do that, run-example has to have its own code to add the examples JAR. Has this been discussed before? Even an env var that spark-class, pyspark and the others could pick up would be enough.

> On 15 Jul 2015, at 20:07, Sean Owen <so...@cloudera.com> wrote:
>
> Why does Spark need to depend on it? I'm missing that bit. If an
> openstack artifact is needed for OpenStack, shouldn't openstack add
> it? Otherwise everybody gets it in their build.
>
> On Wed, Jul 15, 2015 at 7:52 PM, Gil Vernik <g...@il.ibm.com> wrote:
>> I mean that currently users who wish to configure Spark to use OpenStack
>> Swift need to manually edit Spark's pom.xml files (main, core, yarn),
>> add hadoop-openstack.jar to them, and then compile Spark.
>> My question is: why not include this dependency in Spark for the Hadoop
>> profiles 2.4 and up? (hadoop-openstack.jar exists for 2.4 and later
>> versions.)
>>
>> I think when we first integrated Spark + OpenStack Swift there were no
>> profiles in Spark, so it was problematic to include the dependency then.
>> But now it seems easy to achieve, since we have the Hadoop profiles in
>> the POMs.
>>
>> From: Sean Owen <so...@cloudera.com>
>> To: Gil Vernik/Haifa/IBM@IBMIL
>> Cc: Ted Yu <yuzhih...@gmail.com>, Dev <dev@spark.apache.org>,
>>     Josh Rosen <joshro...@databricks.com>, Steve Loughran <ste...@hortonworks.com>
>> Date: 15/07/2015 21:41
>> Subject: Re: problems with build of latest the master
>> ________________________________
>>
>> You shouldn't get dependencies you need from Spark, right? You declare
>> direct dependencies.
>> Are we talking about re-scoping or excluding this dep from Hadoop
>> transitively?
>>
>> On Wed, Jul 15, 2015 at 7:33 PM, Gil Vernik <g...@il.ibm.com> wrote:
>>> Right, it's not currently a dependency in Spark.
>>> Since we're already discussing it: is it possible to make it a
>>> dependency, but only for the Hadoop profiles 2.4 and up?
>>> This would save a lot of headache for those who use Spark + OpenStack
>>> Swift and currently need to manually edit pom.xml every time to add it.
>>>
>>> From: Ted Yu <yuzhih...@gmail.com>
>>> To: Josh Rosen <joshro...@databricks.com>
>>> Cc: Steve Loughran <ste...@hortonworks.com>, Gil Vernik/Haifa/IBM@IBMIL,
>>>     Dev <dev@spark.apache.org>
>>> Date: 15/07/2015 18:28
>>> Subject: Re: problems with build of latest the master
>>> ________________________________
>>>
>>> If I understand correctly, hadoop-openstack is not currently a
>>> dependency of Spark.
>>>
>>> On Jul 15, 2015, at 8:21 AM, Josh Rosen <joshro...@databricks.com> wrote:
>>>
>>> We may be able to fix this from the Spark side by adding appropriate
>>> exclusions in our Hadoop dependencies, right? If possible, I think that
>>> we should do this.
>>>
>>> On Wed, Jul 15, 2015 at 7:10 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>
>>> I attached a patch for HADOOP-12235.
>>>
>>> BTW openstack was not mentioned in the first email from Gil.
>>> My email and Gil's second email were sent around the same moment.
>>>
>>> Cheers
>>>
>>> On Wed, Jul 15, 2015 at 2:06 AM, Steve Loughran <ste...@hortonworks.com>
>>> wrote:
>>>
>>> On 14 Jul 2015, at 12:22, Ted Yu <yuzhih...@gmail.com> wrote:
>>>
>>> Looking at Jenkins, the master branch compiles.
>>>
>>> Can you try the following command?
>>>
>>> mvn -Phive -Phadoop-2.6 -DskipTests clean package
>>>
>>> What version of Java are you using?
>>>
>>> Ted, Gil has stuck in hadoop-openstack; it's that which is creating the
>>> problem.
>>>
>>> Gil, I don't know why hadoop-openstack has a mockito dependency, as it
>>> should be test time only.
>>>
>>> Looking at the POM, in hadoop-2.7 it is scoped to compile:
>>>
>>> <dependency>
>>>   <groupId>org.mockito</groupId>
>>>   <artifactId>mockito-all</artifactId>
>>>   <scope>compile</scope>
>>> </dependency>
>>>
>>> It should be "provided", shouldn't it?
>>>
>>> Created https://issues.apache.org/jira/browse/HADOOP-12235 : if someone
>>> supplies a patch I'll get it in.
>>>
>>> -steve
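For reference, the re-scoping Steve suggests for HADOOP-12235 would presumably amount to a one-line change in hadoop-openstack's pom.xml, along these lines (a sketch of the suggested change, not the committed patch):

```xml
<!-- Sketch: mockito is a test-time concern, so scoping it "provided"
     (as suggested above) keeps it off downstream compile classpaths -->
<dependency>
  <groupId>org.mockito</groupId>
  <artifactId>mockito-all</artifactId>
  <scope>provided</scope>
</dependency>
```

Until something like that lands in Hadoop, excluding mockito-all from the hadoop-openstack dependency on the Spark side, as Josh suggests earlier in the thread, would have the same effect for Spark builds.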