Re: Spark <--> S3 flakiness

[email protected] Thu, 11 May 2017 11:51:15 -0700

Also, and this is unrelated to the actual question... Why don't these
messages show up in the archive?


http://apache-spark-user-list.1001560.n3.nabble.com/

Ideally I'd want to post a link to our internal wiki for these questions,
but can't find them in the archive.

On 11 May 2017 at 07:16, [email protected] <[email protected]> wrote:

> Looks like this isn't viable in spark 2.0.0 (and greater I presume).  I'm
> pretty sure I came across this blog and ignored it due to that.
>
> Any other thoughts?  The linked tickets in:
> https://issues.apache.org/jira/browse/SPARK-10063
> https://issues.apache.org/jira/browse/HADOOP-13786
> https://issues.apache.org/jira/browse/HADOOP-9565
> look relevant too.
>
> On 10 May 2017 at 22:24, Miguel Morales <[email protected]> wrote:
>
>> Try using the DirectParquetOutputCommitter:
>> http://dev.sortable.com/spark-directparquetoutputcommitter/
>>
>> On Wed, May 10, 2017 at 10:07 PM, [email protected]
>> <[email protected]> wrote:
>> > Hi users, we have a bunch of pyspark jobs that are using S3 for loading /
>> > intermediate steps and final output of parquet files.
>> >
>> > We're running into the following issues on a semi regular basis:
>> > * These are intermittent errors, i.e. we have about 300 jobs that run
>> > nightly, and a fairly random but small-ish percentage of them fail with
>> > the following classes of errors.
>> >
>> > S3 write errors
>> >
>> >> "ERROR Utils: Aborting task
>> >> com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 404, AWS
>> >> Service: Amazon S3, AWS Request ID: 2D3RP, AWS Error Code: null, AWS Error
>> >> Message: Not Found, S3 Extended Request ID: BlaBlahEtc="
>> >
>> >
>> >>
>> >> "Py4JJavaError: An error occurred while calling o43.parquet.
>> >> : com.amazonaws.services.s3.model.MultiObjectDeleteException: Status Code:
>> >> 0, AWS Service: null, AWS Request ID: null, AWS Error Code: null, AWS Error
>> >> Message: One or more objects could not be deleted, S3 Extended Request ID:
>> >> null"
>> >
>> >
>> >
>> > S3 Read Errors:
>> >
>> >> [Stage 1:=================================================>       (27 + 4) / 31]
>> >> 17/05/10 16:25:23 ERROR Executor: Exception in task 10.0 in stage 1.0 (TID 11)
>> >> java.net.SocketException: Connection reset
>> >> at java.net.SocketInputStream.read(SocketInputStream.java:196)
>> >> at java.net.SocketInputStream.read(SocketInputStream.java:122)
>> >> at sun.security.ssl.InputRecord.readFully(InputRecord.java:442)
>> >> at sun.security.ssl.InputRecord.readV3Record(InputRecord.java:554)
>> >> at sun.security.ssl.InputRecord.read(InputRecord.java:509)
>> >> at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:927)
>> >> at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:884)
>> >> at sun.security.ssl.AppInputStream.read(AppInputStream.java:102)
>> >> at org.apache.http.impl.io.AbstractSessionInputBuffer.read(AbstractSessionInputBuffer.java:198)
>> >> at org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:178)
>> >> at org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:200)
>> >> at org.apache.http.impl.io.ContentLengthInputStream.close(ContentLengthInputStream.java:103)
>> >> at org.apache.http.conn.BasicManagedEntity.streamClosed(BasicManagedEntity.java:168)
>> >> at org.apache.http.conn.EofSensorInputStream.checkClose(EofSensorInputStream.java:228)
>> >> at org.apache.http.conn.EofSensorInputStream.close(EofSensorInputStream.java:174)
>> >> at java.io.FilterInputStream.close(FilterInputStream.java:181)
>> >> at java.io.FilterInputStream.close(FilterInputStream.java:181)
>> >> at java.io.FilterInputStream.close(FilterInputStream.java:181)
>> >> at java.io.FilterInputStream.close(FilterInputStream.java:181)
>> >> at com.amazonaws.services.s3.model.S3Object.close(S3Object.java:203)
>> >> at org.apache.hadoop.fs.s3a.S3AInputStream.close(S3AInputStream.java:187)
>> >
>> >
>> >
>> > We have tons of logs we could add, but they would make the email
>> > unwieldy.  If it would be helpful I'll drop them in a pastebin or
>> > something.
>> >
>> > Our config is along the lines of:
>> >
>> > spark-2.1.0-bin-hadoop2.7
>> > '--packages
>> > com.amazonaws:aws-java-sdk:1.10.34,org.apache.hadoop:hadoop-aws:2.6.0
>> > pyspark-shell'
>> >
>> > Given the stack overflow / googling I've been doing I know we're not the
>> > only org with these issues but I haven't found a good set of solutions in
>> > those spaces yet.
>> >
>> > Thanks!
>> >
>> > Gary Lucas
>>
>
>
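
For anyone reading this thread later, here is a minimal sketch of an alternative
submit-args string for the config quoted above. None of it was proposed in the
thread. It assumes that spark-2.1.0-bin-hadoop2.7 bundles Hadoop 2.7.3, so that
hadoop-aws 2.7.3 (which pulls in its own AWS SDK dependency) is a closer match
than the hadoop-aws 2.6.0 / aws-java-sdk 1.10.34 pair. The setting values are
illustrative guesses rather than tested recommendations, and the names
S3A_CONF and build_submit_args are hypothetical helpers.

```python
# Sketch only: builds a PYSPARK_SUBMIT_ARGS-style string. Values are assumptions.

# Assumption: matches the Hadoop version bundled with spark-2.1.0-bin-hadoop2.7.
HADOOP_AWS_VERSION = "2.7.3"

# Hypothetical starting values; tune or drop them for your own workload.
S3A_CONF = {
    # Larger S3A HTTP connection pool.
    "spark.hadoop.fs.s3a.connection.maximum": "100",
    # How many times S3A retries a failed request.
    "spark.hadoop.fs.s3a.attempts.maximum": "20",
    # v2 commit algorithm does less renaming at job commit (available in Hadoop 2.7+).
    "spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version": "2",
    # Speculative task attempts can race each other when writing to S3.
    "spark.speculation": "false",
}


def build_submit_args(conf=S3A_CONF, hadoop_aws_version=HADOOP_AWS_VERSION):
    """Return a string in the same shape as the '--packages ... pyspark-shell' config."""
    parts = ["--packages",
             "org.apache.hadoop:hadoop-aws:{}".format(hadoop_aws_version)]
    for key in sorted(conf):
        parts += ["--conf", "{}={}".format(key, conf[key])]
    parts.append("pyspark-shell")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_submit_args())
```

The resulting string would be exported as PYSPARK_SUBMIT_ARGS before the
SparkContext is created. This only aligns library versions and loosens a few
client limits; it does not address the S3 consistency problems discussed in
the linked JIRA tickets.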
