atharvalade opened a new pull request, #3103:
URL: https://github.com/apache/iggy/pull/3103

   ## Which issue does this PR close?
   
   Closes #2956 
   
   ## Rationale
   
   Iggy lacks a native way to write stream messages to Amazon S3 and 
S3-compatible stores (MinIO, Cloudflare R2, Backblaze B2, DigitalOcean Spaces). 
This is a frequently requested capability for data lake ingestion and long-term 
archival pipelines.
   
   ## What changed?
   
   There was no connector for persisting Iggy messages to object storage. Users 
had to build custom consumers and upload logic to get data into S3.
   
   This PR adds a new `iggy_connector_s3_sink` crate that implements the `Sink` 
trait. It buffers messages in memory per stream/topic/partition, rotates files 
by size or message count, renders S3 keys from a configurable path template 
(`{stream}/{topic}/{date}/{hour}/...`), and retries failed uploads with 
exponential backoff. It supports the `json_lines`, `json_array`, and `raw` 
output formats, with optional embedding of Iggy metadata and headers. It uses 
`rust-s3` (already in the workspace) and enables path-style addressing 
automatically for custom endpoints.
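
   To illustrate the path-template idea, here is a minimal sketch of 
placeholder substitution using only the standard library. It is not the code in 
`path.rs`: the function name `render_key`, the variable map, and the 
keep-unknown-placeholders behavior are assumptions made for this example.

```rust
use std::collections::HashMap;

/// Hypothetical helper: replace every `{name}` placeholder in `template`
/// with the matching entry in `vars`. Unknown placeholders are kept verbatim
/// so that a typo in the template stays visible in the resulting key.
pub fn render_key(template: &str, vars: &HashMap<&str, String>) -> String {
    let mut out = String::with_capacity(template.len());
    let mut rest = template;
    while let Some(open) = rest.find('{') {
        // Copy the literal text that precedes the placeholder.
        out.push_str(&rest[..open]);
        let after = &rest[open + 1..];
        match after.find('}') {
            Some(close) => {
                let name = &after[..close];
                match vars.get(name) {
                    Some(value) => out.push_str(value),
                    None => {
                        out.push('{');
                        out.push_str(name);
                        out.push('}');
                    }
                }
                rest = &after[close + 1..];
            }
            None => {
                // Unterminated brace: copy the remainder unchanged.
                out.push_str(&rest[open..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```

   With `stream`, `topic`, `date`, and `hour` set in the map, a template such 
as `{stream}/{topic}/{date}/{hour}/data.jsonl` expands to a single slash-separated 
key. The date and hour formats, and the `data.jsonl` file name, are up to the 
caller in this sketch.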
   
   Key implementation details:
   - **6 source modules**: `lib.rs` (config + entry point), `client.rs` (S3 
client init + bucket verification), `buffer.rs` (in-memory accumulation + 
rotation logic), `formatter.rs` (JSON/raw output + metadata/header inclusion), 
`path.rs` (template engine for S3 keys with offset-based filenames), `sink.rs` 
(Sink trait: open/consume/close lifecycle)
   - **36 unit tests** covering config deserialization, buffer rotation, path 
template rendering, all output formats, credential validation, and edge cases
   - **CI integration**: added to `_build_rust_artifacts.yml` and 
`edge-release.yml` for cdylib plugin builds and release notes
   - **Error handling**: warnings logged on invalid config fallbacks, explicit 
buffer management on upload failure, `close()` warns if the S3 client was never 
initialized
   - **End-to-end tested** locally with MinIO in Docker, Iggy server, CLI 
producer, and connector runtime; verified that messages flow from an Iggy stream 
into the S3 bucket as properly formatted JSON
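
   The rotation rule handled by `buffer.rs` (flush when either a size or a 
message-count limit is reached) can be sketched roughly as follows. All type 
and field names here (`RotationLimits`, `PartitionBuffer`, `Batch`) are 
hypothetical, and the sketch leaves out what the crate does with a batch when 
an upload fails.

```rust
/// Hypothetical rotation thresholds; the field names are illustrative.
#[derive(Debug, Clone, Copy)]
pub struct RotationLimits {
    pub max_bytes: usize,
    pub max_messages: usize,
}

/// In-memory buffer for a single stream/topic/partition.
#[derive(Debug, Default)]
pub struct PartitionBuffer {
    payloads: Vec<Vec<u8>>,
    bytes: usize,
    first_offset: Option<u64>,
    last_offset: Option<u64>,
}

/// A batch that is ready to be written as one object.
#[derive(Debug, PartialEq)]
pub struct Batch {
    pub first_offset: u64,
    pub last_offset: u64,
    pub payloads: Vec<Vec<u8>>,
}

impl PartitionBuffer {
    /// Append one message, then return a batch if either limit is reached.
    pub fn push(
        &mut self,
        offset: u64,
        payload: Vec<u8>,
        limits: RotationLimits,
    ) -> Option<Batch> {
        self.first_offset.get_or_insert(offset);
        self.last_offset = Some(offset);
        self.bytes += payload.len();
        self.payloads.push(payload);
        if self.bytes >= limits.max_bytes || self.payloads.len() >= limits.max_messages {
            self.take()
        } else {
            None
        }
    }

    /// Drain whatever is buffered (for example on shutdown); `None` if empty.
    pub fn take(&mut self) -> Option<Batch> {
        let first_offset = self.first_offset.take()?;
        let last_offset = self.last_offset.take()?;
        self.bytes = 0;
        Some(Batch {
            first_offset,
            last_offset,
            payloads: std::mem::take(&mut self.payloads),
        })
    }
}
```

   Tracking the first and last offsets of each batch is one way to support 
offset-based file names; whether the crate derives its file names this way is 
an assumption of the sketch.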
   
   ## Local Execution
   
   - Passed
   - Pre-commit hooks ran
   - Full CI checklist passed locally:
     - `cargo fmt --check` -- pass
     - `cargo clippy --tests -D warnings` -- pass (zero warnings)
     - `cargo test -p iggy_connector_s3_sink` -- 36/36 pass
     - `markdownlint --check` -- pass
     - `trailing-whitespace` -- pass
     - `trailing-newline` -- pass
     - `license-headers` -- pass
   
   ## AI Usage
   
   1. Opus 4.6
   2. Used for scaffolding boilerplate and the initial file structure; all 
logic was reviewed and iterated on manually
   3. Verified through full local compilation, 36 unit tests, clippy with `-D 
warnings`, and end-to-end testing with MinIO Docker + Iggy server + CLI 
producer + connector runtime
   4. Yes
   
   ---
   
   Here are all the relevant screenshots:
   
   - MinIO Docker container running and accessible at localhost:9000
   - MinIO web console showing the created `iggy-test` bucket
   - Iggy server started with root credentials configured
   - Iggy CLI creating stream `application_logs` and topic `api_requests`
   - Iggy CLI sending test messages to the topic
   - Connector runtime loading the S3 sink plugin and connecting to MinIO
   - Connector runtime consuming messages and uploading to S3
   - MinIO console showing the uploaded `.jsonl` file in the correct path 
structure (`application_logs/api_requests/{date}/{hour}/`)
   - Contents of the uploaded file showing properly formatted JSON lines with 
metadata (offset, timestamp, stream, topic, partition_id, payload)
   - All 36 unit tests passing
   - `cargo clippy --tests -D warnings` passing with zero warnings
   
   
   <img width="1495" height="888" alt="Screenshot 2026-04-13 at 1 37 24 AM" src="https://github.com/user-attachments/assets/0fe39260-48e7-43ee-b093-888a2bfe186c" />
   <img width="808" height="254" alt="Screenshot 2026-04-13 at 1 36 38 AM" src="https://github.com/user-attachments/assets/8c5e5123-49ee-4308-bb75-5d433d6b421a" />
   <img width="813" height="211" alt="Screenshot 2026-04-13 at 1 36 30 AM" src="https://github.com/user-attachments/assets/e1d012a5-3f4b-4b4a-8fa7-328c19af600e" />
   <img width="807" height="706" alt="Screenshot 2026-04-13 at 1 36 12 AM" src="https://github.com/user-attachments/assets/dea839f2-1a31-48c3-a0bd-4ed87187dfd4" />
   <img width="799" height="762" alt="Screenshot 2026-04-13 at 1 35 34 AM" src="https://github.com/user-attachments/assets/4d195c59-b120-4955-a5fc-5050ba231e37" />
   <img width="807" height="730" alt="Screenshot 2026-04-13 at 1 28 47 AM" src="https://github.com/user-attachments/assets/cfb3807c-d350-4298-b87c-4bb8e1f2c72a" />
   <img width="810" height="54" alt="Screenshot 2026-04-13 at 1 28 25 AM" src="https://github.com/user-attachments/assets/4da494df-aa21-48c5-9c1d-c7049f9c4627" />
   <img width="686" height="50" alt="Screenshot 2026-04-13 at 1 28 16 AM" src="https://github.com/user-attachments/assets/39691781-a5b0-406c-a011-45c78626e409" />
   <img width="771" height="42" alt="Screenshot 2026-04-13 at 1 28 06 AM" src="https://github.com/user-attachments/assets/b58b05e9-0838-457f-8912-4d78c2f2b8b3" />
   <img width="718" height="735" alt="Screenshot 2026-04-13 at 1 27 52 AM" src="https://github.com/user-attachments/assets/cd295af4-7fc9-4557-8eeb-a3d8ec08e0b0" />
   


