[ https://issues.apache.org/jira/browse/FLINK-4582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16667569#comment-16667569 ]

Ying Xu commented on FLINK-4582:
--------------------------------

Thanks [~tinder-dthomson] for the detailed comments. Yes, that's exactly why I felt _efficient multi-stream_ support is somewhat lacking :).

Actually, we are running Flink 1.5.2 internally. To contribute upstream, I'm currently adapting the patch to fit Flink master (1.7-SNAPSHOT). The main difference is that the Flink 1.7 Kinesis connector uses the _listshards API_ to retrieve the shard list. For DynamoDB streams, we must use the _describeStreams API_ to retrieve that information, since listshards is not supported. I am currently porting the related logic around _describeStreams_ from Flink 1.5 to my patch.

I should be able to post a meaningful PR in 1-2 days.

> Allow FlinkKinesisConsumer to adapt for AWS DynamoDB Streams
> ------------------------------------------------------------
>
>                 Key: FLINK-4582
>                 URL: https://issues.apache.org/jira/browse/FLINK-4582
>             Project: Flink
>          Issue Type: New Feature
>          Components: Kinesis Connector, Streaming Connectors
>            Reporter: Tzu-Li (Gordon) Tai
>            Assignee: Ying Xu
>            Priority: Major
>
> AWS DynamoDB is a NoSQL database service that has a CDC-like (change data
> capture) feature called DynamoDB Streams
> (http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.html),
> which is a stream feed of item-level table activities.
> The DynamoDB Streams shard abstraction follows that of Kinesis Streams with
> only a slight difference in resharding behaviours, so it is possible to build
> on the internals of our Flink Kinesis Consumer for an exactly-once DynamoDB
> Streams source.
> I propose an API something like this:
> {code}
> DataStream dynamoItemsCdc =
>   FlinkKinesisConsumer.asDynamoDBStream(tableNames, schema, config)
> {code}
> The feature adds more connectivity to popular AWS services for Flink, and
> combining what Flink has for exactly-once semantics, out-of-core state
> backends, and queryable state with CDC can have very strong use cases. For
> this feature there should only be an extra dependency to the AWS Java SDK for
> DynamoDB, which has Apache License 2.0.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)