Hello,

I have successfully been able to store checkpoint data in an S3 bucket. I used to have a similar issue earlier. What you need to confirm:

1. The S3 bucket is created with read/write access (irrespective of whether it is MinIO or AWS S3).
2. The "flink/opt/flink-s3-fs-presto-1.14.0.jar" jar is copied to the plugin directory "flink/plugins/s3-fs-presto".
3. Add the following configuration (via the configuration file or programmatically, either way):
    state.checkpoints.dir: s3://<bucket-name>/checkpoints
    state.backend.fs.checkpointdir: s3://<bucket-name>/checkpoints/
    s3.path-style: true
    s3.path.style.access: true

On Wed, Oct 27, 2021 at 2:47 AM Vamshi G <vgandr...@salesforce.com> wrote:

> s3a with the Hadoop S3 filesystem works fine for us with STS assume-role
> credentials and with KMS. Below is how our Hadoop s3a configs look. Since
> the endpoint is globally whitelisted, we don't explicitly mention the
> endpoint.
>
> fs.s3a.aws.credentials.provider:
> org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider
> fs.s3a.assumed.role.credentials.provider:
> com.amazonaws.auth.profile.ProfileCredentialsProvider
> fs.s3a.assumed.role.arn: arn:aws:iam::<account>:role/<iam_role>
> fs.s3a.server-side-encryption-algorithm: SSE-KMS
> fs.s3a.server-side-encryption.key:
> arn:aws:kms:<region>:<account>:key/<key-alias>
>
> However, for checkpointing we definitely want to use presto s3, and we
> just could not make it work. FINE logging on presto-hive is not helping
> either, as the lib uses the airlift logger.
> Also, based on the code here
> https://github.com/prestodb/presto/blob/2aeedb944fc8b47bfe1cad78732d6dd2308ee9ad/presto-hive/src/main/java/com/facebook/presto/hive/s3/PrestoS3FileSystem.java#L821,
> PrestoS3FileSystem does switch to IAM role credentials if one is provided.
>
> Has anyone been successful using the presto S3 filesystem in Flink v1.13.0?
>
> Thanks,
> Vamshi
>
> On Mon, Aug 16, 2021 at 3:59 AM David Morávek <d...@apache.org> wrote:
>
>> Hi Vamshi,
>>
>> From your configuration I'm guessing that you're using Amazon S3 (not an
>> implementation such as MinIO).
>>
>> Two comments:
>> - *s3.endpoint* should not contain the bucket (this is included in your
>> s3 path, e.g.
>> *s3://<bucket>/<file>*)
>> - "*s3.path.style.access*: true" is only correct for third-party
>> implementations such as MinIO / Swift, which have the bucket defined in
>> the URL path instead of the subdomain.
>>
>> You can find some information about connecting to S3 in the Flink docs [1].
>>
>> [1]
>> https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/filesystems/s3/
>>
>> Best,
>> D.
>>
>> On Tue, Aug 10, 2021 at 2:37 AM Vamshi G <vgandr...@salesforce.com>
>> wrote:
>>
>>> We are using Flink version 1.13.0 on Kubernetes.
>>> For checkpointing we have configured fs.s3 flink-s3-fs-presto.
>>> We have enabled SSE on our buckets with a KMS CMK.
>>>
>>> flink-conf.yaml is configured as below:
>>> s3.entropy.key: _entropy_
>>> s3.entropy.length: 4
>>> s3.path.style.access: true
>>> s3.ssl.enabled: true
>>> s3.sse.enabled: true
>>> s3.sse.type: KMS
>>> s3.sse.kms-key-id: <ARN of key id>
>>> s3.iam-role: <IAM role with read/write access to bucket>
>>> s3.endpoint: <bucketname>.s3-us-west-2.amazonaws.com
>>> s3.credentials-provider:
>>> com.amazonaws.auth.profile.ProfileCredentialsProvider
>>>
>>> However, PUT operations on the bucket are resulting in an access denied
>>> error. Access policies for the role have been checked and work fine when
>>> tested with the CLI.
>>> Also, we can't get debug logs from the presto S3 lib; is there a way to
>>> enable a logger for presto airlift logging?
>>>
>>> Any inputs on the above issue?

--
Regards,
Parag Surajmal Somani.
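The two pitfalls David points out in this thread (the bucket baked into *s3.endpoint*, and path-style access enabled against real AWS S3) can be caught before deployment. Below is a minimal, hypothetical pre-flight check in Python; the helper name and its heuristics are assumptions for illustration, not part of Flink's API, though the key names follow flink-conf.yaml:

```python
def check_s3_config(conf: dict) -> list:
    """Flag the S3 checkpoint misconfigurations discussed in this thread.
    Hypothetical pre-flight helper; key names follow flink-conf.yaml."""
    problems = []
    uri = conf.get("state.checkpoints.dir", "")
    if not uri.startswith("s3://"):
        problems.append("state.checkpoints.dir should use the s3:// scheme")
        return problems
    bucket = uri[len("s3://"):].split("/", 1)[0]
    endpoint = conf.get("s3.endpoint", "")
    # The bucket belongs in the checkpoint URI, never in the endpoint.
    if endpoint.startswith(bucket + "."):
        problems.append("s3.endpoint should not contain the bucket name")
    # Path-style access is for MinIO/Swift-style stores, not AWS itself.
    if conf.get("s3.path.style.access") == "true" and endpoint.endswith("amazonaws.com"):
        problems.append("s3.path.style.access should be false for AWS S3")
    return problems
```

Run against Vamshi's configuration above, this flags both the bucket-prefixed endpoint and the path-style setting.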
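Vamshi's configuration also enables entropy injection (*s3.entropy.key*, *s3.entropy.length*). Conceptually, Flink replaces the entropy marker in the checkpoint path with random characters so that checkpoint objects spread across S3 key-space partitions. A hypothetical sketch of that substitution (not Flink's actual implementation):

```python
import random
import string

def inject_entropy(path: str, key: str = "_entropy_", length: int = 4) -> str:
    # Replace the entropy marker with random alphanumeric characters,
    # mimicking how Flink spreads checkpoint objects across S3 partitions.
    # Hypothetical helper for illustration only.
    entropy = "".join(random.choices(string.ascii_lowercase + string.digits, k=length))
    return path.replace(key, entropy)

# e.g. "s3://bucket/checkpoints/_entropy_/chk-42"
#   -> "s3://bucket/checkpoints/x7q9/chk-42"  (random segment varies)
```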