Hi, I am getting the following error while reading a large dataset from S3 and, after processing, writing the data back to S3.

Did you find any solution for this?

16/02/07 07:41:59 WARN scheduler.TaskSetManager: Lost task 144.2 in stage 3.0 (TID 169, ip-172-31-7-26.us-west-2.compute.internal):
java.io.IOException: exception in uploadSinglePart
        at com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream.uploadSinglePart(MultipartUploadOutputStream.java:248)
        at com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream.close(MultipartUploadOutputStream.java:469)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
        at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:105)
        at org.apache.hadoop.io.compress.CompressorStream.close(CompressorStream.java:106)
        at java.io.FilterOutputStream.close(FilterOutputStream.java:160)
        at org.apache.hadoop.mapred.TextOutputFormat$LineRecordWriter.close(TextOutputFormat.java:109)
        at org.apache.spark.SparkHadoopWriter.close(SparkHadoopWriter.scala:102)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:1080)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:1059)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
        at org.apache.spark.scheduler.Task.run(Task.scala:64)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: exception in putObject
        at com.amazon.ws.emr.hadoop.fs.s3n.Jets3tNativeFileSystemStore.storeFile(Jets3tNativeFileSystemStore.java:149)
        at sun.reflect.GeneratedMethodAccessor26.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
        at com.sun.proxy.$Proxy26.storeFile(Unknown Source)
        at com.amazon.ws.emr.hadoop.fs.s3n.MultipartUploadOutputStream.uploadSinglePart(MultipartUploadOutputStream.java:245)
        ... 15 more
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: The Content-MD5 you specified did not match what we received. (Service: Amazon S3; Status Code: 400; Error Code: BadDigest; Request ID: 5918216A5901FCC8), S3 Extended Request ID: QSxtYln/yXqHYpdr4BWosin/TAFsGlK1FlKfE5PcuJkNrgoblGzTNt74kEhuNcrJCRZ3mXq0oUo=
        at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1182)
        at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:770)
        at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:489)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:310)
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3796)
        at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1482)
        at com.amazon.ws.emr.hadoop.fs.s3n.Jets3tNativeFileSystemStore.storeFile(Jets3tNativeFileSystemStore.java:140)
        ... 22 more
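For context, the trace is the plain RDD-to-S3 write path on EMR: saveAsTextFile drives TextOutputFormat and FSDataOutputStream into EMRFS's MultipartUploadOutputStream, and the final putObject fails the Content-MD5 check. Below is a minimal sketch of that kind of job. The bucket and paths are hypothetical, and the fs.s3n.multipart.uploads.enabled property is only a workaround sometimes suggested for BadDigest errors on EMR, not a confirmed fix; please verify it against your EMR release.

import org.apache.spark.{SparkConf, SparkContext}

object S3RoundTrip {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("S3RoundTrip"))

    // Assumption: disabling EMRFS multipart uploads has been suggested as a
    // workaround for BadDigest; check the property name for your EMR version.
    sc.hadoopConfiguration.set("fs.s3n.multipart.uploads.enabled", "false")

    // Hypothetical bucket and prefixes, for illustration only.
    val input  = sc.textFile("s3://my-bucket/input/")
    val output = input.map(_.trim) // placeholder for the real processing

    // saveAsTextFile ends in MultipartUploadOutputStream.close(), which is
    // where the trace above fails during the final putObject.
    output.saveAsTextFile("s3://my-bucket/output/")

    sc.stop()
  }
}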




