Samrat002 commented on code in PR #27187:
URL: https://github.com/apache/flink/pull/27187#discussion_r2807755089


##########
flink-filesystems/flink-s3-fs-native/README.md:
##########
@@ -0,0 +1,355 @@
+# Native S3 FileSystem
+
+This module provides a native S3 filesystem implementation for Apache Flink using AWS SDK v2.
+
+## Overview
+
+The Native S3 FileSystem is a direct implementation of Flink's FileSystem interface using AWS SDK v2, without Hadoop dependencies. It provides exactly-once semantics for checkpointing and file sinks through S3 multipart uploads.
+
+## Supported URI Schemes
+
+This module supports both `s3://` and `s3a://` URI schemes:
+
+| Scheme | Description |
+|--------|-------------|
+| `s3://` | Primary scheme for native S3 filesystem |
+| `s3a://` | Hadoop S3A compatibility scheme - allows drop-in replacement for existing Hadoop-based configurations |
+
+Both schemes use the same native AWS SDK v2 implementation and share identical configuration options.
+
+**Example usage with either scheme:**
+
+```java
+// Using s3:// scheme
+env.getCheckpointConfig().setCheckpointStorage("s3://my-bucket/checkpoints");
+
+// Using s3a:// scheme (for Hadoop compatibility)
+env.getCheckpointConfig().setCheckpointStorage("s3a://my-bucket/checkpoints");
+```
+
+## Usage
+
+Add this module to Flink's plugins directory:
+
+```bash
+mkdir -p $FLINK_HOME/plugins/s3-fs-native
+cp flink-s3-fs-native-*.jar $FLINK_HOME/plugins/s3-fs-native/
+```
+
+Configure S3 credentials in `conf/config.yaml`:
+
+```yaml
+s3.access-key: YOUR_ACCESS_KEY
+s3.secret-key: YOUR_SECRET_KEY
+s3.endpoint: https://s3.amazonaws.com  # Optional, defaults to AWS
+```
+
+Use S3 paths in your Flink application:
+
+```java
+env.getCheckpointConfig().setCheckpointStorage("s3://my-bucket/checkpoints");
+
+DataStream<String> input = env.readTextFile("s3://my-bucket/input");
+input.sinkTo(FileSink.forRowFormat(new Path("s3://my-bucket/output"), 
+                                    new SimpleStringEncoder<>()).build());
+```
+
+## Configuration Options
+
+### Core Settings
+
+| Key | Default | Description |
+|-----|---------|-------------|
+| s3.access-key | (none) | AWS access key |
+| s3.secret-key | (none) | AWS secret key |
+| s3.region | (auto-detect) | AWS region (auto-detected via AWS_REGION, ~/.aws/config, EC2 metadata) |
+| s3.endpoint | (none) | Custom S3 endpoint (for MinIO, LocalStack, etc.) |
+| s3.path-style-access | false | Use path-style access (auto-enabled for custom endpoints) |
+| s3.upload.min.part.size | 5242880 | Minimum part size for multipart uploads (5MB) |
+| s3.upload.max.concurrent.uploads | CPU cores | Maximum concurrent uploads per stream |
+| s3.entropy.key | (none) | Key for entropy injection in paths |
+| s3.entropy.length | 4 | Length of entropy string |
+| s3.bulk-copy.enabled | true | Enable bulk copy operations |
+| s3.async.enabled | true | Enable async read/write with TransferManager |
+| s3.read.buffer.size | 262144 (256KB) | Read buffer size per stream (64KB - 4MB) |
+
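As a rough illustration of how `s3.upload.min.part.size` interacts with S3's limit of 10,000 parts per multipart upload, the sketch below picks a part size for a given object size. `PartSizing`, `partSizeFor`, and `partCount` are hypothetical names used only for this example; the module's actual sizing logic is not shown in this README and may differ.

```java
/** Hypothetical helper for illustration; not part of the module's API. */
final class PartSizing {
    /** S3 allows at most 10,000 parts in one multipart upload. */
    static final long MAX_PARTS = 10_000L;
    /** Default of s3.upload.min.part.size from the table above (5242880 bytes). */
    static final long DEFAULT_MIN_PART_SIZE = 5L * 1024 * 1024;

    /** Smallest part size that is >= minPartSize and keeps the upload within MAX_PARTS. */
    static long partSizeFor(long objectSize, long minPartSize) {
        if (objectSize < 0 || minPartSize <= 0) {
            throw new IllegalArgumentException("objectSize must be >= 0 and minPartSize > 0");
        }
        long needed = (objectSize + MAX_PARTS - 1) / MAX_PARTS; // ceiling division
        return Math.max(minPartSize, needed);
    }

    /** Number of parts produced when objectSize is split into chunks of partSize. */
    static long partCount(long objectSize, long partSize) {
        return (objectSize + partSize - 1) / partSize; // ceiling division
    }

    public static void main(String[] args) {
        long oneGiB = 1L << 30;
        long partSize = partSizeFor(oneGiB, DEFAULT_MIN_PART_SIZE);
        System.out.println(partSize + " bytes per part, " + partCount(oneGiB, partSize) + " parts");
    }
}
```

With the default minimum, objects up to roughly 48.8 GiB fit within 10,000 parts; beyond that, the part size has to grow.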
+### Server-Side Encryption (SSE)
+
+| Key | Default | Description |
+|-----|---------|-------------|
+| s3.sse.type | none | Encryption type: `none`, `sse-s3` (AES256), `sse-kms` (AWS KMS) |
+| s3.sse.kms.key-id | (none) | KMS key ID/ARN/alias for SSE-KMS (uses default aws/s3 key if not specified) |
+
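To make the `s3.sse.type` values concrete, here is a minimal sketch that maps each one to the value S3 expects in the `x-amz-server-side-encryption` request header (`AES256` or `aws:kms`). The `SseType` enum and `fromConfig` method are hypothetical, and the case-insensitive parsing is an assumption rather than documented behaviour of this module.

```java
import java.util.Locale;

/** Hypothetical enum for illustration; not part of the module's API. */
enum SseType {
    NONE(null),
    SSE_S3("AES256"),
    SSE_KMS("aws:kms");

    private final String headerValue;

    SseType(String headerValue) {
        this.headerValue = headerValue;
    }

    /** Value for the x-amz-server-side-encryption header, or null when SSE is off. */
    String headerValue() {
        return headerValue;
    }

    /** Parses an s3.sse.type setting; a missing or blank value means no encryption. */
    static SseType fromConfig(String value) {
        if (value == null || value.trim().isEmpty()) {
            return NONE;
        }
        // Assumption: values are matched case-insensitively.
        switch (value.trim().toLowerCase(Locale.ROOT)) {
            case "none":
                return NONE;
            case "sse-s3":
                return SSE_S3;
            case "sse-kms":
                return SSE_KMS;
            default:
                throw new IllegalArgumentException("Unsupported s3.sse.type: " + value);
        }
    }

    public static void main(String[] args) {
        System.out.println(fromConfig("sse-kms").headerValue());
    }
}
```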
+### IAM Assume Role
+
+| Key | Default | Description |
+|-----|---------|-------------|
+| s3.assume-role.arn | (none) | ARN of the IAM role to assume |
+| s3.assume-role.external-id | (none) | External ID for cross-account access |
+| s3.assume-role.session-name | flink-s3-session | Session name for the assumed role |
+| s3.assume-role.session-duration | 3600 | Session duration in seconds (900-43200) |
+
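The defaults and the 900-43200 second range in the table above can be checked before any role is assumed. The following is a sketch only: `AssumeRoleSettings` is a hypothetical holder class, and how the module itself validates these options is not shown in this README.

```java
/** Hypothetical settings holder for illustration; not part of the module's API. */
final class AssumeRoleSettings {
    static final int MIN_SESSION_SECONDS = 900;
    static final int MAX_SESSION_SECONDS = 43_200;
    static final int DEFAULT_SESSION_SECONDS = 3_600;
    static final String DEFAULT_SESSION_NAME = "flink-s3-session";

    final String roleArn;
    final String externalId; // optional, may be null
    final String sessionName;
    final int sessionDurationSeconds;

    /** Null sessionName or sessionDurationSeconds fall back to the documented defaults. */
    AssumeRoleSettings(
            String roleArn, String externalId, String sessionName, Integer sessionDurationSeconds) {
        if (roleArn == null || roleArn.trim().isEmpty()) {
            throw new IllegalArgumentException("s3.assume-role.arn must be set");
        }
        int duration =
                sessionDurationSeconds == null ? DEFAULT_SESSION_SECONDS : sessionDurationSeconds;
        if (duration < MIN_SESSION_SECONDS || duration > MAX_SESSION_SECONDS) {
            throw new IllegalArgumentException(
                    "s3.assume-role.session-duration must be within 900-43200, got " + duration);
        }
        this.roleArn = roleArn.trim();
        this.externalId = externalId;
        this.sessionName = sessionName == null ? DEFAULT_SESSION_NAME : sessionName;
        this.sessionDurationSeconds = duration;
    }

    public static void main(String[] args) {
        AssumeRoleSettings s =
                new AssumeRoleSettings("arn:aws:iam::123456789012:role/S3AccessRole", null, null, null);
        System.out.println(s.sessionName + " for " + s.sessionDurationSeconds + "s");
    }
}
```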
+## Server-Side Encryption (SSE)
+
+The filesystem supports server-side encryption for data at rest:
+
+### SSE-S3 (S3-Managed Keys)
+
+Amazon S3 manages the encryption keys. Simplest option with no additional configuration.
+
+```yaml
+s3.sse.type: sse-s3
+```
+
+All objects will be encrypted with AES-256 using keys managed by S3.
+
+### SSE-KMS (AWS KMS-Managed Keys)
+
+Use AWS Key Management Service for encryption key management. Provides additional security features like key rotation, audit trails, and fine-grained access control.
+
+**Using the default aws/s3 key:**
+```yaml
+s3.sse.type: sse-kms
+```
+
+**Using a custom KMS key:**
+```yaml
+s3.sse.type: sse-kms
+s3.sse.kms.key-id: arn:aws:kms:us-east-1:123456789012:key/12345678-1234-1234-1234-123456789abc
+# Or use an alias:
+# s3.sse.kms.key-id: alias/my-s3-encryption-key
+```
+
+**Note:** Ensure the IAM role/user has `kms:Encrypt` and `kms:GenerateDataKey` permissions on the KMS key. Multipart uploads and reads of SSE-KMS objects also require `kms:Decrypt`.
+
+## IAM Assume Role
+
+For cross-account access or temporary elevated permissions, configure an IAM role to assume:
+
+### Basic Assume Role
+
+```yaml
+s3.assume-role.arn: arn:aws:iam::123456789012:role/S3AccessRole
+```
+
+### Cross-Account Access with External ID
+
+For enhanced security when granting access to third parties:

Review Comment:
   I will pick this up as a follow-up task once this patch is merged.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
