Shekharrajak commented on issue #18439: URL: https://github.com/apache/druid/issues/18439#issuecomment-4246116345
Clarifying my previous points: reading from Kafka will be "at least once", but writing to Druid deep storage will be "exactly once".

I have an initial MVP version here: https://github.com/apache/druid/pull/19311

We can run Kafka 4 locally and produce messages like:

```
{"__time":"2025-06-01T00:00:00.000Z","item":"widget_a","value":100,"category":"electronics"}
{"__time":"2025-06-01T01:00:00.000Z","item":"widget_b","value":250,"category":"clothing"}
```

and then, in the local Druid UI, submit an ingestion spec like:

```
{
  "type": "index_kafka_share_group",
  "dataSchema": {
    "dataSource": "share_group_demo",
    "timestampSpec": { "column": "__time", "format": "auto" },
    "dimensionsSpec": { "useSchemaDiscovery": true },
    "granularitySpec": { "segmentGranularity": "DAY", "queryGranularity": "NONE" }
  },
  "ioConfig": {
    "type": "kafka_share_group",
    "topic": "druid-share-test",
    "groupId": "druid-demo-share-group",
    "consumerProperties": { "bootstrap.servers": "localhost:9092" },
    "inputFormat": { "type": "json" },
    "pollTimeout": 2000
  },
  "tuningConfig": {
    "type": "KafkaTuningConfig",
    "maxRowsPerSegment": 5000000
  }
}
```
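For anyone who wants more than two test rows, here is a minimal sketch of a generator for messages in the same shape as the samples above. It is only an illustration, not part of the PR: the helper names `make_events` and `to_json_lines` are made up for this example, and it assumes the one-hour spacing between events seen in the two sample rows.

```python
import json
from datetime import datetime, timedelta, timezone


def make_events(start, rows):
    """Build one event per (item, value, category) row, spaced one hour apart.

    The one-hour spacing is an assumption taken from the two sample messages.
    """
    events = []
    for i, (item, value, category) in enumerate(rows):
        ts = start + timedelta(hours=i)
        events.append({
            # Same ISO-8601 shape as the sample rows, always with .000 millis
            "__time": ts.strftime("%Y-%m-%dT%H:%M:%S.000Z"),
            "item": item,
            "value": value,
            "category": category,
        })
    return events


def to_json_lines(events):
    """Serialize events as compact JSON, one message per line."""
    return "\n".join(json.dumps(e, separators=(",", ":")) for e in events)


if __name__ == "__main__":
    start = datetime(2025, 6, 1, tzinfo=timezone.utc)
    rows = [
        ("widget_a", 100, "electronics"),
        ("widget_b", 250, "clothing"),
    ]
    print(to_json_lines(make_events(start, rows)))
```

If this is saved as, say, `make_events.py` (a hypothetical file name), the output could be piped into the console producer that ships with Kafka, for example `python make_events.py | kafka-console-producer.sh --bootstrap-server localhost:9092 --topic druid-share-test`, so that each line becomes one message on the topic.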
