junaiddshaukat commented on issue #18479: URL: https://github.com/apache/beam/issues/18479#issuecomment-3875801814
Hi @je-ik,

I've put together an initial draft of the design document for the portable Kafka Streams runner: https://docs.google.com/document/d/1BBMURhSG4SxPcvvnKMTrmnKCr_jhXL6R4TBDBW7zsy8/edit?tab=t.0

It covers the high-level architecture, the pipeline translation approach (following the Flink portable runner pattern), and transform mappings for Read, ParDo, GBK, CBK, Window, and Flatten, along with sections on watermark management, bundle handling, and state management.

I've included a few architecture diagrams and listed open questions at the end, particularly around watermark derivation from Kafka consumer positions, repartitioning strategy for GBK, and bundle-commit alignment for exactly-once semantics.

Looking forward to your feedback so we can iterate on this before sharing it on the dev@ list.
