Try downloading this jar:
http://search.maven.org/remotecontent?filepath=org/apache/hadoop/hadoop-aws/2.6.0/hadoop-aws-2.6.0.jar
and add it to your classpath:
export CLASSPATH=$CLASSPATH:hadoop-aws-2.6.0.jar
Then try relaunching.
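Alternatively, here is a minimal PySpark sketch (the jar path is an assumption; point it at wherever you saved hadoop-aws-2.6.0.jar) that puts the jar on both the driver and executor classpaths before the context is created:

    from pyspark import SparkConf, SparkContext

    # Sketch only: adjust the path to where hadoop-aws-2.6.0.jar actually lives.
    conf = (SparkConf()
            .setAppName("s3a-test")
            .set("spark.driver.extraClassPath", "/path/to/hadoop-aws-2.6.0.jar")
            .set("spark.executor.extraClassPath", "/path/to/hadoop-aws-2.6.0.jar"))
    sc = SparkContext(conf=conf)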
Thanks,
Peter Rudenko
On 2015-05-07 19:30, Nicholas Chammas wrote:
Hmm, I just tried changing s3n to s3a:

py4j.protocol.Py4JJavaError: An error occurred while calling
z:org.apache.spark.api.python.PythonRDD.collectAndServe. :
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class
org.apache.hadoop.fs.s3a.S3AFileSystem not found
Nick
On Thu, May 7, 2015 at 12:29 PM Peter Rudenko <petro.rude...@gmail.com> wrote:
Hi Nick, I had the same issue.
By default it should work with the s3a protocol:
sc.textFile('s3a://bucket/file_*').count()
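For example, if your AWS credentials aren't already picked up from the environment or an IAM role, a rough sketch (sc._jsc is a private PySpark handle, and the key names assume Hadoop 2.6's s3a) would be:

    # Sketch: set S3A credentials on the Hadoop configuration before reading.
    # Skip this if credentials already come from env vars or IAM roles.
    hadoop_conf = sc._jsc.hadoopConfiguration()
    hadoop_conf.set("fs.s3a.access.key", "YOUR_ACCESS_KEY")  # placeholder value
    hadoop_conf.set("fs.s3a.secret.key", "YOUR_SECRET_KEY")  # placeholder value
    sc.textFile("s3a://bucket/file_*").count()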
If you want to use the s3n protocol, you need to add hadoop-aws.jar to
Spark's classpath. Which Hadoop vendor (Hortonworks, Cloudera,
MapR) do you use?
Thanks,
Peter Rudenko
On 2015-05-07 19:25, Nicholas Chammas wrote:
Details are here: https://issues.apache.org/jira/browse/SPARK-7442
It looks like something specific to building against Hadoop 2.6?
Nick