We have been looking for a while for some way to decouple the S3 filesystem
support from Hadoop.
Does anyone know a good S3 connector library that works independently of
Hadoop and EMRFS?
Best,
Stephan
On Wed, Nov 23, 2016 at 7:57 PM, Greg Hogan wrote:
EMRFS looks to *add* cost (and consistency).
Storing an object to S3 costs "$0.005 per 1,000 requests", so $0.432/day at
1 Hz. Is the number of checkpoint files simply parallelism * number of
operators? That could add up quickly.
Is the recommendation to run HDFS on EBS?
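Greg's arithmetic can be sketched as a quick back-of-envelope calculation. The per-request price is the PUT/POST tier quoted above; the files-per-checkpoint model (parallelism × operators) is only the guess from this message, not something confirmed elsewhere in the thread:

```python
# Back-of-envelope S3 checkpoint cost, using the figures quoted above.
# The files-per-checkpoint model (parallelism * operators) is a guess.
PUT_COST_PER_1000_REQUESTS = 0.005  # USD, "per 1,000 requests" as quoted
SECONDS_PER_DAY = 86_400

def daily_put_cost_usd(checkpoints_per_second=1.0, files_per_checkpoint=1):
    requests = checkpoints_per_second * SECONDS_PER_DAY * files_per_checkpoint
    return requests * PUT_COST_PER_1000_REQUESTS / 1000

print(daily_put_cost_usd())                            # 0.432 -> the $0.432/day above
print(daily_put_cost_usd(files_per_checkpoint=8 * 5))  # e.g. parallelism 8, 5 operators
```

At one file per second this reproduces the $0.432/day figure; with 40 files per checkpoint it is already $17.28/day, which is the "adds up quickly" point.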
Hi Scott,
Thanks for the suggestion; it sounds like you and I think alike. Moving over
to HDFS sounds like the simplest solution to me.
There is no requirement to use S3; it's just that another team member is
generally sceptical, fearing that adding HDFS will introduce a new class of
maintenance problems to our setup.
Hi Greg,
Standard storage class, everything is on defaults, we've not done anything
special with the bucket.
CloudWatch only appears to give me total billing for S3 in general; I
don't see a breakdown unless that's something I can configure somewhere.
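If detailed billing report CSVs are enabled for the account, the S3 line items can be grouped by usage type to get roughly the breakdown Greg asked about. A rough sketch; the column names (`ProductCode`, `UsageType`, `Cost`) and the per-usage-type layout are assumptions about the billing CSV format, not something confirmed in this thread:

```python
import csv
from collections import defaultdict

def s3_cost_by_usage_type(path):
    """Sum billed cost per S3 usage type (requests vs. storage vs. transfer).

    Assumes a billing CSV with ProductCode, UsageType and Cost columns;
    adjust the column names to match the actual report format.
    """
    totals = defaultdict(float)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            if row.get("ProductCode") == "AmazonS3":
                totals[row["UsageType"]] += float(row["Cost"] or 0)
    return dict(totals)
```

Running this over a month's report would show whether the cost is dominated by request tiers (the checkpointing PUTs) or by storage byte-hours.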
Regards,
Jonathan
Date: Wednesday, November 23, 2016 at 8:24 AM
To: "user@flink.apache.org"
Subject: Re: S3 checkpointing in AWS in Frankfurt
Hi Jonathan,
have you tried using Amazon's latest EMR Hadoop distribution? Maybe they've
fixed the issue in their fork of the older Hadoop releases?
On Wed, Nov 23, 2016 at 4:38 PM, Scott Kidder wrote:
Hi Jonathan,
You might be better off creating a small Hadoop HDFS cluster just for the
purpose of storing Flink checkpoint & savepoint data. Like you, I tried
using S3 to persist Flink state, but encountered AWS SDK issues and felt
like I was going down an ill-advised path. I then created a small HDFS
cluster dedicated to checkpoint and savepoint storage.
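For reference, pointing Flink's filesystem state backend at such a cluster is a small configuration change. A sketch of the relevant flink-conf.yaml entries for the Flink versions current at the time; the hostname, port, and path are illustrative placeholders, not values from this thread:

```yaml
# Sketch only: namenode host/port and path are placeholders.
state.backend: filesystem
state.backend.fs.checkpointdir: hdfs://namenode:8020/flink/checkpoints
```

Savepoints can be directed at the same cluster with the corresponding savepoint directory setting.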
Hi Jonathan,
Which S3 storage class are you using? Do you have a breakdown of the S3
costs as storage / API calls / early deletes / data transfer?
Greg
On Wed, Nov 23, 2016 at 2:52 AM, Jonathan Share wrote:
Hi,
I'm interested in hearing if anyone else has experience with using Amazon
S3 as a state backend in the Frankfurt region. For political reasons we've
been asked to keep all European data in Amazon's Frankfurt region. This
causes a problem, as the S3 endpoint in Frankfurt requires the use of AWS
Signature Version 4.
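For background on the Frankfurt limitation: that endpoint only accepts Signature Version 4 requests, which the S3 clients bundled with older Hadoop releases do not send. With a recent Hadoop s3a client, the regional endpoint can be pinned in core-site.xml; a sketch, with the endpoint value being the standard eu-central-1 hostname rather than anything confirmed in this thread:

```xml
<!-- Sketch: pin s3a to the eu-central-1 (Frankfurt) endpoint, which requires
     Signature Version 4 support in the Hadoop s3a client / AWS SDK. -->
<property>
  <name>fs.s3a.endpoint</name>
  <value>s3.eu-central-1.amazonaws.com</value>
</property>
```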