Hi Greg,

Standard storage class; everything is on the defaults and we've not done
anything special with the bucket.

CloudWatch only appears to give me the total bill for S3 as a whole; I
don't see a breakdown, unless that's something I can configure somewhere.

Regards,
Jonathan


On 23 November 2016 at 16:29, Greg Hogan <c...@greghogan.com> wrote:

> Hi Jonathan,
>
> Which S3 storage class are you using? Do you have a breakdown of the S3
> costs as storage / API calls / early deletes / data transfer?
>
> Greg
>
> On Wed, Nov 23, 2016 at 2:52 AM, Jonathan Share <jon.sh...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I'm interested in hearing whether anyone else has experience with using
>> Amazon S3 as a state backend in the Frankfurt region. For political
>> reasons we've been asked to keep all European data in Amazon's Frankfurt
>> region. This causes a problem, as the S3 endpoint in Frankfurt requires
>> the use of AWS Signature Version 4 ("This new Region supports only
>> Signature Version 4" [1]), and this doesn't appear to work with the
>> Hadoop version that Flink is built against [2].
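>>
>> For reference, with an S3A client that can do V4 signing, the relevant
>> knobs appear to be roughly the following in core-site.xml (a sketch
>> based on the newer S3A docs, not something we have verified as minimal;
>> the endpoint shown is Frankfurt's):
>>
>>   <property>
>>     <name>fs.s3a.endpoint</name>
>>     <value>s3.eu-central-1.amazonaws.com</value>
>>   </property>
>>   <!-- the signing-algorithm property only exists in newer Hadoop
>>        releases (2.8+/3.x) -->
>>   <property>
>>     <name>fs.s3a.signing-algorithm</name>
>>     <value>AWSS3V4SignerType</value>
>>   </property>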
>>
>> After some hacking we have managed to create a Docker image with a build
>> of Flink 1.2 master, copying over jar files from the Hadoop 3.0.0-alpha1
>> package, and this appears to work for the most part, but we still suffer
>> from some classpath problems (conflicts between the AWS APIs used in
>> Hadoop and those we want to use in our streams for interacting with
>> Kinesis), and the whole thing feels a little fragile. Has anyone else
>> tried this? Is there a simpler solution?
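>>
>> One idea we have not yet tried for the classpath side is relocating the
>> AWS SDK classes inside our job jar with the maven-shade-plugin, along
>> these lines (an untested sketch; the shaded package name is arbitrary):
>>
>>   <plugin>
>>     <groupId>org.apache.maven.plugins</groupId>
>>     <artifactId>maven-shade-plugin</artifactId>
>>     <executions>
>>       <execution>
>>         <phase>package</phase>
>>         <goals><goal>shade</goal></goals>
>>         <configuration>
>>           <relocations>
>>             <!-- keep the SDK our Kinesis code uses away from Hadoop's -->
>>             <relocation>
>>               <pattern>com.amazonaws</pattern>
>>               <shadedPattern>shaded.com.amazonaws</shadedPattern>
>>             </relocation>
>>           </relocations>
>>         </configuration>
>>       </execution>
>>     </executions>
>>   </plugin>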
>>
>> As a follow-up question: with the checkpoint interval on three
>> relatively simple streams set to 1 second, we saw S3 costs higher than
>> the EC2 costs for our entire infrastructure, which seems slightly
>> disproportionate. For now we have reduced the checkpoint interval to 10
>> seconds, and that has greatly improved the cost projections graphed in
>> Amazon CloudWatch, but I'm interested in hearing other people's
>> experiences with this. Is that the level of billing we should expect, or
>> is it a symptom of a misconfiguration? Is this a setup others are using?
>> As we are using Kinesis as the source for all streams, I don't see a
>> huge risk in larger checkpoint intervals, and our sinks are designed to
>> mostly tolerate duplicates (some improvements can be made).
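>>
>> For concreteness, the change amounts to the interval passed to
>> enableCheckpointing(); a minimal sketch of the kind of setup we run
>> (class and bucket names are placeholders):
>>
>>   import org.apache.flink.runtime.state.filesystem.FsStateBackend;
>>   import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
>>
>>   public class CheckpointedJob {
>>     public static void main(String[] args) throws Exception {
>>       StreamExecutionEnvironment env =
>>           StreamExecutionEnvironment.getExecutionEnvironment();
>>       // 10 s between checkpoints instead of the 1 s we started with
>>       env.enableCheckpointing(10000);
>>       // checkpoint state to S3 via the Hadoop file system support
>>       env.setStateBackend(new FsStateBackend("s3://my-bucket/checkpoints"));
>>       // ... Kinesis sources and sinks elided ...
>>       env.execute("checkpointed job");
>>     }
>>   }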
>>
>> Thanks in advance
>> Jonathan
>>
>>
>> [1] https://aws.amazon.com/blogs/aws/aws-region-germany/
>> [2] https://issues.apache.org/jira/browse/HADOOP-13324
>>
>
>
