Shekharrajak commented on issue #18439:
URL: https://github.com/apache/druid/issues/18439#issuecomment-4246116345

   Clarifying my previous points:
   
   Reading from Kafka will be "at least once", but writing to Druid deep 
storage will be "exactly once". 
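
One general way an at-least-once read can still give an exactly-once write is to make the publish step idempotent and acknowledge only after it succeeds. The sketch below is a toy simulation of that idea, not the mechanism in the PR: `AtLeastOnceSource`, `IdempotentSink`, and the batch ids are hypothetical stand-ins, and keying publishes by batch id is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AtLeastOnceSource:
    """Hypothetical read path: batches that were not acknowledged are redelivered."""
    batches: dict[str, list[dict]]
    acked: set[str] = field(default_factory=set)

    def poll(self):
        # Return the first batch that has not been acknowledged yet, or None.
        for batch_id, rows in self.batches.items():
            if batch_id not in self.acked:
                return batch_id, rows
        return None

    def acknowledge(self, batch_id):
        self.acked.add(batch_id)


@dataclass
class IdempotentSink:
    """Hypothetical write path: at most one publish is stored per batch id."""
    published: dict[str, list[dict]] = field(default_factory=dict)

    def publish(self, batch_id, rows):
        # A redelivered batch hits an existing key and is ignored.
        if batch_id in self.published:
            return False
        self.published[batch_id] = list(rows)
        return True


def run(source, sink, crash_before_ack=()):
    """Drain the source; 'crash' once per listed batch after publish, before ack."""
    pending_crashes = set(crash_before_ack)
    deliveries = 0
    while (polled := source.poll()) is not None:
        batch_id, rows = polled
        deliveries += 1
        sink.publish(batch_id, rows)
        if batch_id in pending_crashes:
            pending_crashes.discard(batch_id)  # the ack is lost, so the batch comes back
            continue
        source.acknowledge(batch_id)
    return deliveries


source = AtLeastOnceSource({
    "b0": [{"item": "widget_a", "value": 100}],
    "b1": [{"item": "widget_b", "value": 250}],
})
sink = IdempotentSink()
deliveries = run(source, sink, crash_before_ack={"b0"})
stored_rows = [row for rows in sink.published.values() for row in rows]
print(deliveries, len(stored_rows))
```

Batch `b0` is delivered twice because its first acknowledgement is lost, yet each row is stored only once.
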
   I have an initial MVP version here: https://github.com/apache/druid/pull/19311 
   With a local Kafka 4 broker running, we can produce messages like 
   ```
   {"__time":"2025-06-01T00:00:00.000Z","item":"widget_a","value":100,"category":"electronics"}
   {"__time":"2025-06-01T01:00:00.000Z","item":"widget_b","value":250,"category":"clothing"}
   ```
   
   and, in the local Druid UI, submit an ingestion spec like this: 
   
   ```
   {
     "type": "index_kafka_share_group",
     "dataSchema": {
       "dataSource": "share_group_demo",
       "timestampSpec": {
         "column": "__time",
         "format": "auto"
       },
       "dimensionsSpec": {
         "useSchemaDiscovery": true
       },
       "granularitySpec": {
         "segmentGranularity": "DAY",
         "queryGranularity": "NONE"
       }
     },
     "ioConfig": {
       "type": "kafka_share_group",
       "topic": "druid-share-test",
       "groupId": "druid-demo-share-group",
       "consumerProperties": {
         "bootstrap.servers": "localhost:9092"
       },
       "inputFormat": {
         "type": "json"
       },
       "pollTimeout": 2000
     },
     "tuningConfig": {
       "type": "KafkaTuningConfig",
       "maxRowsPerSegment": 5000000
     }
   }
   ```
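
For local testing, messages shaped like the two samples earlier in this comment could be generated with a short script. This is only a sketch: `sample_messages` is a hypothetical helper, and the hourly spacing, item naming, and value step are assumptions chosen so that the first two rows match the samples.

```python
import json
from datetime import datetime, timedelta, timezone


def sample_messages(count, start=datetime(2025, 6, 1, tzinfo=timezone.utc)):
    """Build `count` hourly test rows shaped like the sample messages."""
    # Categories alternate between the two values seen in the samples (assumed pattern).
    categories = ["electronics", "clothing"]
    rows = []
    for i in range(count):
        ts = start + timedelta(hours=i)
        rows.append({
            "__time": ts.strftime("%Y-%m-%dT%H:%M:%S.000Z"),
            "item": f"widget_{chr(ord('a') + i % 26)}",
            "value": 100 + 150 * i,
            "category": categories[i % len(categories)],
        })
    return rows


# One compact JSON object per line, ready to be sent as one Kafka message each.
lines = [json.dumps(row, separators=(",", ":")) for row in sample_messages(2)]
print("\n".join(lines))
```

The printed lines could then be piped into a producer for the topic named in the spec, for example `kafka-console-producer.sh --bootstrap-server localhost:9092 --topic druid-share-test`, assuming the stock Kafka CLI tools are on the path.
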


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

