You can add the classpath info in the Hadoop env file.

Add the following line to your $HADOOP_HOME/etc/hadoop/hadoop-env.sh:

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HADOOP_HOME/share/hadoop/tools/lib/*

Add the following line to $SPARK_HOME/conf/spark-env.sh:

export SPARK_DIST_CLASSPATH=$($HADOOP_HOME/bin/hadoop classpath)
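
Once the master and workers have been restarted with that classpath in place,
a quick sanity check from spark-shell is to confirm the s3a classes resolve
and that a read works. A minimal sketch, assuming the Hadoop 2.7.x s3a client
and an instance IAM role with read access; the bucket and prefix below are
placeholders:

// Run inside spark-shell. Class.forName throws ClassNotFoundException if the
// hadoop tools jars (hadoop-aws + AWS SDK) did not make it onto the classpath.
Class.forName("org.apache.hadoop.fs.s3a.S3AFileSystem")

// s3a should pick up IAM instance-profile credentials through the AWS SDK
// credential chain, so no access keys are set here.
val lines = sc.textFile("s3a://my-bucket/some/prefix/part-*")
println(lines.count())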
Hmm, I tried using --jars, but that gets passed to MasterArguments, and that
doesn't work :-(
https://github.com/apache/spark/blob/branch-1.5/core/src/main/scala/org/apache/spark/deploy/master/MasterArguments.scala

Same with Worker:
https://github.com/apache/spark/blob/branch-1.5/core/src/main/scala/org/apache/spark/deploy/worker/WorkerArguments.scala
> On 15 Oct 2015, at 19:04, Scott Reynolds wrote:
>
> List,
>
> Right now we build our spark jobs with the s3a hadoop client. We do this
> because our machines are only allowed to use IAM access to the s3 store. We
> can build our jars with the s3a filesystem and the aws sdk just fine and thi
You can use the Spark 1.5.1 "without Hadoop" build together with Hadoop 2.7.1.
Hadoop 2.7.1 is more mature for s3a access. You also need to add the Hadoop
tools dir to the Hadoop classpath.

Raghav
On Oct 16, 2015 1:09 AM, "Scott Reynolds" wrote:
> We do not use EMR. This is deployed on Amazon VMs
>
> We build Spark with
We do not use EMR. This is deployed on Amazon VMs.

We build Spark with Hadoop-2.6.0, but that does not include the s3a
filesystem or the Amazon AWS SDK.
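
For context, getting s3a into the application jars typically just means
depending on hadoop-aws, which pulls the AWS SDK in transitively. A minimal
sbt sketch; sbt itself and the exact version numbers here are illustrative,
not something stated in this thread:

// build.sbt (sketch). hadoop-aws contains org.apache.hadoop.fs.s3a.S3AFileSystem
// and declares a matching aws-java-sdk as a transitive dependency; spark-core
// is "provided" because the cluster already ships it.
libraryDependencies ++= Seq(
  "org.apache.spark"  %% "spark-core" % "1.5.1" % "provided",
  "org.apache.hadoop" %  "hadoop-aws" % "2.7.1"
)

Bundling these into the job jar is the part that already works; getting the
same classes onto the standalone master and worker classpath is the separate
problem discussed above.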
On Thu, Oct 15, 2015 at 12:26 PM, Spark Newbie wrote:
> Are you using EMR?
> You can install Hadoop-2.6.0 along with Spark-1.5.1 in your EMR clu
Are you using EMR?
You can install Hadoop-2.6.0 along with Spark-1.5.1 in your EMR cluster.
That brings the s3a jars to the worker nodes, and they become available to
your application.
On Thu, Oct 15, 2015 at 11:04 AM, Scott Reynolds wrote:
> List,
>
> Right now we build our spark jobs with the s3