Re: problems with build of latest the master

Steve Loughran Thu, 16 Jul 2015 02:15:29 -0700

Patching Hadoop's build will fix this in the long term, but not until Hadoop 2.7.2.

I think just adding the hadoop-openstack JAR to the Spark classpath should be
enough to pick this up, which the --jars option can do with ease.
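
A minimal sketch of the --jars approach, assuming the hadoop-openstack JAR has already been downloaded or built locally. The JAR path, its version number, the default SPARK_HOME and the helper name spark_shell_cmd are all hypothetical; the function only prints the command so it can be checked before it is run.

```shell
# Hypothetical location and version of the JAR; adjust to your installation.
OPENSTACK_JAR="${OPENSTACK_JAR:-/opt/hadoop/share/hadoop/tools/lib/hadoop-openstack-2.6.0.jar}"

# Print the spark-shell invocation that passes the JAR through --jars.
# SPARK_HOME falls back to /opt/spark, which is only a placeholder.
spark_shell_cmd() {
  printf '%s --jars %s\n' "${SPARK_HOME:-/opt/spark}/bin/spark-shell" "$OPENSTACK_JAR"
}
```

Running `eval "$(spark_shell_cmd)"` would then start the shell with the JAR on the classpath, provided the paths contain no spaces. Several JARs can be passed to --jars as a comma-separated list.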

On that topic, one thing I would like to see (knowing what it takes to get
Azure and s3a support in) would be for the Spark shell scripts to auto-add
everything in SPARK_HOME/lib to the classpath (or at least to have the option
to do this). Because spark-class doesn't do that, run-example has to have its
own code to add the examples JAR.

Has this been discussed before? Even having an env var that spark-class,
pyspark & others could pick up would be enough.
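
To make the idea concrete, here is a rough sketch of what such a hook might look like: a helper that collects every JAR in a directory into a colon-separated classpath and exports it. The function name jars_in_dir and the variable SPARK_EXTRA_CLASSPATH are invented for illustration; the launch scripts do not read a variable of that name, so this only shows the shape of the proposal.

```shell
# Build a colon-separated classpath from every JAR in a directory.
# Prints an empty line if the directory contains no JARs.
jars_in_dir() {
  dir="$1"
  cp=""
  for jar in "$dir"/*.jar; do
    # When the glob matches nothing it stays literal; skip that case.
    [ -e "$jar" ] || continue
    cp="${cp:+$cp:}$jar"
  done
  printf '%s\n' "$cp"
}

# SPARK_EXTRA_CLASSPATH is a hypothetical name, not a variable Spark reads.
SPARK_EXTRA_CLASSPATH="$(jars_in_dir "${SPARK_HOME:-.}/lib")"
export SPARK_EXTRA_CLASSPATH
```

Until something like this exists in the scripts, the value could be handed over by hand, for example through spark-submit's --driver-class-path option.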


> On 15 Jul 2015, at 20:07, Sean Owen <so...@cloudera.com> wrote:
> 
> Why does Spark need to depend on it? I'm missing that bit. If an
> openstack artifact is needed for openstack, shouldn't openstack add
> it? Otherwise everybody gets it in their build.
> 
> On Wed, Jul 15, 2015 at 7:52 PM, Gil Vernik <g...@il.ibm.com> wrote:
>> I mean that currently users who wish to configure Spark to use
>> OpenStack Swift need to manually edit Spark's pom.xml (main, core, yarn),
>> add hadoop-openstack.jar to it, and then compile Spark.
>> My question is: why not include this dependency in Spark for Hadoop
>> profiles 2.4 and up? (hadoop-openstack.jar exists for 2.4 and later
>> versions.)
>> 
>> I think when we first integrated Spark  + OpenStack Swift there were no
>> profiles in Spark and so it was problematic to include this dependency
>> there. But now it seems to be easy to achieve, since we have hadoop profiles
>> in the poms.
>> 
>> 
>> 
>> From:        Sean Owen <so...@cloudera.com>
>> To:        Gil Vernik/Haifa/IBM@IBMIL
>> Cc:        Ted Yu <yuzhih...@gmail.com>, Dev <dev@spark.apache.org>, Josh
>> Rosen <joshro...@databricks.com>, Steve Loughran <ste...@hortonworks.com>
>> Date:        15/07/2015 21:41
>> Subject:        Re: problems with build of latest the master
>> ________________________________
>> 
>> 
>> 
>> You shouldn't get dependencies you need from Spark, right? You declare
>> direct dependencies. Are we talking about re-scoping or excluding this
>> dep from Hadoop transitively?
>> 
>> On Wed, Jul 15, 2015 at 7:33 PM, Gil Vernik <g...@il.ibm.com> wrote:
>>> Right, it's not currently a dependency in Spark.
>>> Since it has come up, is it possible to make it a dependency,
>>> but only for Hadoop profiles 2.4 and up?
>>> This would save a lot of headaches for those who use Spark + OpenStack Swift
>>> and have to manually edit pom.xml every time to add the dependency.
>>> 
>>> 
>>> 
>>> From:        Ted Yu <yuzhih...@gmail.com>
>>> To:        Josh Rosen <joshro...@databricks.com>
>>> Cc:        Steve Loughran <ste...@hortonworks.com>, Gil
>>> Vernik/Haifa/IBM@IBMIL, Dev <dev@spark.apache.org>
>>> Date:        15/07/2015 18:28
>>> Subject:        Re: problems with build of latest the master
>>> ________________________________
>>> 
>>> 
>>> 
>>> If I understand correctly, hadoop-openstack is not currently a dependency
>>> in Spark.
>>> 
>>> 
>>> 
>>> On Jul 15, 2015, at 8:21 AM, Josh Rosen <joshro...@databricks.com> wrote:
>>> 
>>> We may be able to fix this from the Spark side by adding appropriate
>>> exclusions in our Hadoop dependencies, right?  If possible, I think that
>>> we should do this.
>>> 
>>> On Wed, Jul 15, 2015 at 7:10 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>>> I attached a patch for HADOOP-12235
>>> 
>>> BTW openstack was not mentioned in the first email from Gil.
>>> My email and Gil's second email were sent around the same moment.
>>> 
>>> Cheers
>>> 
>>> On Wed, Jul 15, 2015 at 2:06 AM, Steve Loughran <ste...@hortonworks.com>
>>> wrote:
>>> 
>>> On 14 Jul 2015, at 12:22, Ted Yu <yuzhih...@gmail.com> wrote:
>>> 
>>> Looking at Jenkins, master branch compiles.
>>> 
>>> Can you try the following command ?
>>> 
>>> mvn -Phive -Phadoop-2.6 -DskipTests clean package
>>> 
>>> What version of Java are you using ?
>>> 
>>> Ted, Gil has added hadoop-openstack; that is what is creating the
>>> problem.
>>> 
>>> Gil, I don't know why hadoop-openstack has a mockito dependency, as it
>>> should be test-time only.
>>> 
>>> Looking at the POM, in hadoop-2.7 it's scoped to compile:
>>>    <dependency>
>>>      <groupId>org.mockito</groupId>
>>>      <artifactId>mockito-all</artifactId>
>>>      <scope>compile</scope>
>>>    </dependency>
>>> 
>>> it should be "provided", shouldn't it?
>>> 
>>> Created https://issues.apache.org/jira/browse/HADOOP-12235 : if someone
>>> supplies a patch I'll get it in.
>>> 
>>> -steve
>>> 
>>> 
>>> 
>> 
>> 
>> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org
