Regarding S3 and the Rolling/BucketingSink, we've seen data loss when
resuming from checkpoints, as S3 FileSystem implementations flush to
temporary files while the RollingSink expects a direct flush to in-progress
files. Because there is no such thing as "flush and resume writing" to S3,
I don't k
Hi!
The "truncate()" functionality is only needed for the rolling/bucketing
sink. The core checkpoint functionality does not need any truncate()
behavior...
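To illustrate what the rolling/bucketing sink needs truncate() for, here is a minimal, self-contained sketch using plain java.nio on a local file (this is not Flink's actual sink code; file names and the class name are made up for illustration): on restore, the sink cuts an in-progress file back to the last checkpointed valid length, discarding any bytes written after the checkpoint.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class TruncateOnRestore {
    public static void main(String[] args) throws IOException {
        Path inProgress = Files.createTempFile("part-0-0", ".in-progress");

        // Bytes written before the checkpoint; their length is stored in state.
        byte[] checkpointed = "record-1\nrecord-2\n".getBytes(StandardCharsets.UTF_8);
        Files.write(inProgress, checkpointed);
        long validLength = checkpointed.length;

        // Bytes written after the checkpoint; these must be dropped on restore.
        Files.write(inProgress, "record-3(partial".getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.APPEND);

        // Restore: truncate the in-progress file back to the valid length.
        try (FileChannel ch = FileChannel.open(inProgress, StandardOpenOption.WRITE)) {
            ch.truncate(validLength);
        }

        System.out.println(Files.size(inProgress) == validLength);
        Files.deleteIfExists(inProgress);
    }
}
```

On a file system without truncate() (such as S3A), the sink cannot do this in place, which is why the capability matters only for the sink and not for core checkpointing.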
Best,
Stephan
On Tue, Oct 11, 2016 at 5:22 PM, Vijay Srinivasaraghavan <
vijikar...@yahoo.com.invalid> wrote:
Thanks Stephan. My understanding is that checkpointing uses the truncate() API,
but S3A does not support it. Will this have any impact?
Some of the known S3A client limitations are captured on the Hortonworks site
(https://hortonworks.github.io/hdp-aws/s3-s3aclient/index.html), and I am
wondering whether that has any impact on
Hi!
In 1.2-SNAPSHOT, we recently fixed issues due to the "eventual consistency"
nature of S3. The fix is not in v1.1 - that is the only known issue I can
think of.
It results in occasional (though rare) periods of heavy restart retries, until
all files are visible to all participants.
If you run into
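The restart-retry behavior described above can be sketched with a toy model (this is not Flink code; the store class, file names, and lag counters are invented for illustration): a restore attempt repeatedly lists the checkpoint files and retries until the eventually consistent listing finally shows everything that was written.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class RetryUntilVisible {
    /** Toy eventually consistent store: a file only shows up in listings
     *  after a configurable number of failed reads, mimicking S3's old
     *  eventual-consistency behavior. */
    static class EventuallyConsistentStore {
        private final Set<String> files = new HashSet<>();
        private final Map<String, Integer> lag = new HashMap<>();

        void put(String name, int readsUntilVisible) {
            files.add(name);
            lag.put(name, readsUntilVisible);
        }

        Set<String> list() {
            Set<String> visible = new HashSet<>();
            for (String f : files) {
                int remaining = lag.merge(f, -1, Integer::sum);
                if (remaining < 0) {
                    visible.add(f);
                }
            }
            return visible;
        }
    }

    public static void main(String[] args) {
        EventuallyConsistentStore store = new EventuallyConsistentStore();
        store.put("chk-42/op-1", 2); // becomes visible only on the 3rd listing
        store.put("chk-42/op-2", 0); // visible on the first listing

        Set<String> expected = Set.of("chk-42/op-1", "chk-42/op-2");
        int retries = 0;
        while (!store.list().containsAll(expected)) {
            retries++; // a real restore would back off between attempts
        }
        System.out.println("restored after " + retries + " retries");
    }
}
```

The point is only that the job eventually succeeds once all files are visible; until then, each restore attempt fails and is retried, which is the "heavy restart retries" symptom mentioned above.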
Hello,
Per documentation
(https://ci.apache.org/projects/flink/flink-docs-master/setup/aws.html), it
looks like the S3/S3A FS implementation is supported through the standard
Hadoop S3 FS client APIs.
If we skip a standard HCFS and go directly with S3/S3A,
1) Are there any known limitations/issues?
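For context, the setup the linked documentation page describes amounts to a Hadoop configuration along these lines (a hedged sketch from memory of that page, not an authoritative excerpt; check the page for the exact property names for your Flink/Hadoop versions):

```xml
<!-- core-site.xml, referenced from flink-conf.yaml via fs.hdfs.hadoopconf -->
<configuration>
  <property>
    <name>fs.s3.impl</name>
    <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
  </property>
  <property>
    <name>fs.s3a.access.key</name>
    <value>YOUR_ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>YOUR_SECRET_KEY</value>
  </property>
</configuration>
```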