Yep, it's a Hadoop issue: https://issues.apache.org/jira/browse/HADOOP-11863

http://mail-archives.apache.org/mod_mbox/hadoop-user/201504.mbox/%3CCA+XUwYxPxLkfhOxn1jNkoUKEQQMcPWFzvXJ=u+kp28kdejo...@mail.gmail.com%3E
http://stackoverflow.com/a/28033408/3271168


So for now you need to manually add that jar to the classpath on hadoop-2.6.
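As a sketch of that workaround (the jar location below is hypothetical — adjust it to wherever you downloaded the jar), you can either extend the classpath in the shell or hand the jar to Spark at launch:

```shell
# Hypothetical jar location -- adjust to wherever you downloaded it.
AWS_JAR="$PWD/hadoop-aws-2.6.0.jar"

# Put the jar on the classpath for this session...
export CLASSPATH="$CLASSPATH:$AWS_JAR"

# ...or, for Spark jobs, ship it explicitly when launching:
# spark-submit --jars "$AWS_JAR" my_job.py
```

The `--jars` route has the advantage of distributing the jar to executors as well, rather than relying on every node's environment.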

Thanks,
Peter Rudenko

On 2015-05-07 19:41, Nicholas Chammas wrote:
I can try that, but my understanding is that this is supposed to work out of the box (as it does with all the other Spark/Hadoop pre-built packages).

On Thu, May 7, 2015 at 12:35 PM Peter Rudenko <petro.rude...@gmail.com <mailto:petro.rude...@gmail.com>> wrote:

    Try to download this jar:
    
http://search.maven.org/remotecontent?filepath=org/apache/hadoop/hadoop-aws/2.6.0/hadoop-aws-2.6.0.jar

    And add:

    export CLASSPATH=$CLASSPATH:hadoop-aws-2.6.0.jar

    And try to relaunch.

    Thanks,
    Peter Rudenko


    On 2015-05-07 19:30, Nicholas Chammas wrote:

    Hmm, I just tried changing s3n to s3a:

        py4j.protocol.Py4JJavaError: An error occurred while calling
        z:org.apache.spark.api.python.PythonRDD.collectAndServe. :
        java.lang.RuntimeException: java.lang.ClassNotFoundException:
        Class org.apache.hadoop.fs.s3a.S3AFileSystem not found

    Nick

    On Thu, May 7, 2015 at 12:29 PM Peter Rudenko
    <petro.rude...@gmail.com <mailto:petro.rude...@gmail.com>> wrote:

        Hi Nick, I had the same issue.
        By default it should work with the s3a protocol:

        sc.textFile('s3a://bucket/file_*').count()
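If the s3a filesystem class does load but authentication fails, credentials can be supplied as Hadoop properties — s3a reads fs.s3a.access.key / fs.s3a.secret.key, not the s3n-style fs.s3n.awsAccessKeyId names. A minimal sketch, with placeholder keys and a hypothetical properties-file name:

```shell
# Sketch: write s3a credentials (placeholder values) as spark-defaults
# style settings; the spark.hadoop.* prefix copies them into the Hadoop
# Configuration that S3AFileSystem reads.
cat > spark-s3a.conf <<'EOF'
spark.hadoop.fs.s3a.access.key YOUR_ACCESS_KEY
spark.hadoop.fs.s3a.secret.key YOUR_SECRET_KEY
EOF

# Launch against that file, then the s3a:// read above should authenticate:
# pyspark --properties-file spark-s3a.conf
```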


        If you want to use the s3n protocol you need to add
        hadoop-aws.jar to Spark's classpath. Which Hadoop vendor
        (Hortonworks, Cloudera, MapR) do you use?

        Thanks,
        Peter Rudenko

        On 2015-05-07 19:25, Nicholas Chammas wrote:
        Details are here: https://issues.apache.org/jira/browse/SPARK-7442

        It looks like something specific to building against Hadoop 2.6?

        Nick



