Hey Cliff!

I was able to reproduce this by running a job locally with RocksDB
semi-asynchronous checkpoints (the current default) written to S3A.
I've created an issue here: https://issues.apache.org/jira/browse/FLINK-4228.

With S3N it works as expected, so you can use that implementation as a
workaround. I don't know whether it's possible to disable the creation
of MD5 hashes for S3A.
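
The "(Is a directory)" part of the error can be reproduced without Flink
or the AWS SDK. Below is a minimal sketch, assuming the client computes
the MD5 by opening the local path as a file stream before uploading (I
have not verified this against the SDK source). The class and method
names (Md5OfDirectory, md5Hex, canHash) are made up for illustration and
are not part of Flink or the SDK.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Md5OfDirectory {

    // Hypothetical helper: streams the bytes at the given path through MD5
    // and returns the digest as lowercase hex.
    static String md5Hex(File path) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("MD5");
        // Opening or reading a directory this way fails with an IOException.
        try (InputStream in = new FileInputStream(path)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Returns true if the path could be hashed, false if hashing failed.
    static boolean canHash(File path) {
        try {
            md5Hex(path);
            return true;
        } catch (IOException | NoSuchAlgorithmException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-ins for a local checkpoint directory and a file inside it.
        File dir = Files.createTempDirectory("local-chk-").toFile();
        File file = Files.createTempFile(dir.toPath(), "state-", ".bin").toFile();
        Files.write(file.toPath(), new byte[] {1, 2, 3});

        System.out.println("file hashable: " + canHash(file));
        System.out.println("directory hashable: " + canHash(dir));
    }
}
```

If that assumption holds, any upload request that points at a directory
rather than a regular file would fail in the same way, regardless of the
directory's contents.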

– Ufuk

On Sat, Jul 16, 2016 at 6:26 PM, Clifford Resnick
<cresn...@mediamath.com> wrote:
> Using Flink 1.1-SNAPSHOT, Hadoop-aws 2.6.4
>
>
>
> The error I’m getting is:
>
>
>
> 11:05:44,425 ERROR org.apache.flink.streaming.runtime.tasks.StreamTask - Caught exception while materializing asynchronous checkpoints.
> com.amazonaws.AmazonClientException: Unable to calculate MD5 hash: /var/folders/t8/k5764ltj4sq4ft06c1zp0nxn928mwr/T/flink-io-247956be-e422-4222-a512-e3ae321b1590/ede87211c622f86d1ef7b2b323076e79/WindowOperator_10_3/dummy_state/31b7ca7b-dc94-4d40-84c7-4f10ebc644a2/local-chk-1 (Is a directory)
>         at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1266)
>         at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInOneChunk(UploadCallable.java:131)
>         at com.amazonaws.services.s3.transfer.internal.UploadCallable.call(UploadCallable.java:123)
>         at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:139)
>         at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:47)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
>
>
>
> In the debugger I noticed that some of the uploaded checkpoints come from the
> configured /tmp location. These succeed because the file in the request is
> fully qualified, but I guess it’s different for WindowOperators? Here the file
> in the request (using a different /var/folders.. location not configured by me
> – must be a Mac thing?) is actually a directory. The AWS API fails when
> it tries to calculate an MD5 hash of the directory. The Flink side of the
> code path is hard to follow in the debugger because it’s asynchronous.
>
>
>
> I get the same issue whether I run locally or on a CentOS-based YARN cluster.
> Everything works if I use HDFS instead. Any insight will be greatly
> appreciated! When I get a chance later I may try S3N, or perhaps S3A with MD5
> verification skipped.
>
>
>
> -Cliff
