[ https://issues.apache.org/jira/browse/FLINK-24229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17450520#comment-17450520 ]
Yuri Gusev edited comment on FLINK-24229 at 11/30/21, 4:22 PM:
---------------------------------------------------------------

Hi [~CrynetLogistics],

We are working on it. The implementation is almost there; we will submit a PR for review this week or early next week.

In our previous implementation we deduplicated entries while aggregating a batch, but now we need to move this into the writer itself, because the batching is done by the AsyncSinkWriter.

I'm not sure I followed your answer on the fatal-exception behaviour. What we would like is to let the user define how to handle a failure once all retries towards DynamoDB for the current batch are exhausted (for example, send the "poisonous" record to a DLQ, or drop it). This is how we achieved it in the old implementation: [WriteRequestFailureHandler.java|https://github.com/YuriGusev/flink/blob/FLINK-16504_dynamodb_connector_rebased/flink-connectors/flink-connector-dynamodb/src/main/java/org/apache/flink/streaming/connectors/dynamodb/WriteRequestFailureHandler.java], [DynamoDbSink.java|https://github.com/YuriGusev/flink/blob/8787c343b615602c989fa793e0f4687ef40e530c/flink-connectors/flink-connector-dynamodb/src/main/java/org/apache/flink/streaming/connectors/dynamodb/DynamoDbSink.java#L355]

At the moment it seems [it will send the failed records back|https://github.com/apache/flink/blob/cb5034b6b8d1a601ae3bc34feb3314518e78aae3/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/writer/AsyncSinkWriter.java#L304] to the queue and retry again, or, on a fatal exception, propagate the error and stop the application.
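To make the proposal concrete, here is a minimal sketch of such a pluggable handler. The interface name and signature are hypothetical, loosely modelled on the linked WriteRequestFailureHandler; only {{WriteRequest}} is a real AWS SDK v2 class.

{code:java}
import java.io.Serializable;
import software.amazon.awssdk.services.dynamodb.model.WriteRequest;

/**
 * Hypothetical callback, invoked once a batch entry has exhausted all
 * retries towards DynamoDB. The implementation decides whether to drop
 * the record, forward it to a dead-letter queue, or fail the job by
 * rethrowing the error.
 */
public interface WriteRequestFailureHandler extends Serializable {
    void onFailedRequest(WriteRequest failedRequest, Throwable error) throws Exception;
}
{code}

A DLQ variant would then be a one-liner at sink construction time, e.g. {{(request, error) -> dlqProducer.send(request)}} (where {{dlqProducer}} is whatever client the user supplies), while a default handler could simply rethrow to keep the current fail-fast behaviour.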
> [FLIP-171] DynamoDB implementation of Async Sink
> ------------------------------------------------
>
>                 Key: FLINK-24229
>                 URL: https://issues.apache.org/jira/browse/FLINK-24229
>             Project: Flink
>          Issue Type: New Feature
>          Components: Connectors / Common
>            Reporter: Zichen Liu
>            Assignee: Zichen Liu
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.15.0
>
>
> h2. Motivation
>
> *User stories:*
> As a Flink user, I’d like to use DynamoDB as a sink for my data pipeline.
>
> *Scope:*
> * Implement an asynchronous sink for DynamoDB by inheriting the AsyncSinkBase class. The implementation can for now reside in its own module in flink-connectors.
> * Implement an asynchronous sink writer for DynamoDB by extending the AsyncSinkWriter. The implementation must deal with failed requests and retry them using the {{requeueFailedRequestEntry}} method. If possible, the implementation should batch multiple requests (WriteRequest objects) to DynamoDB for increased throughput (a sketch follows the references below). The implemented sink writer will be used by the sink class created as part of this story.
> * Java / code-level docs.
> * End-to-end testing: add tests that hit a real AWS instance. (How can we best donate resources to the Flink project to allow this to happen?)
>
> h2. References
>
> More details can be found at [https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink]
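To illustrate the writer bullet above, a rough sketch of {{submitRequestEntries}} under the AsyncSinkWriter contract at the revision linked in the comment, where entries handed to the result consumer are re-queued for retry (the consumer's exact type has changed across revisions, so the signature below is an assumption). The {{client}}, {{tableName}}, and {{primaryKeyOf}} helper are hypothetical wiring for the example; the deduplication step reflects the comment above, since DynamoDB's BatchWriteItem rejects batches containing duplicate primary keys.

{code:java}
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

import software.amazon.awssdk.services.dynamodb.DynamoDbAsyncClient;
import software.amazon.awssdk.services.dynamodb.model.BatchWriteItemRequest;
import software.amazon.awssdk.services.dynamodb.model.WriteRequest;

// Inside a DynamoDbSinkWriter<InputT> extends AsyncSinkWriter<InputT, WriteRequest>:

private final DynamoDbAsyncClient client = DynamoDbAsyncClient.create(); // assumed wiring
private final String tableName = "example-table";                        // assumed

@Override
protected void submitRequestEntries(
        List<WriteRequest> requestEntries, Consumer<List<WriteRequest>> requestResult) {

    // Deduplicate per primary key (keeping the last write), since batching now
    // happens in AsyncSinkWriter and BatchWriteItem rejects duplicate keys.
    // primaryKeyOf(...) is a hypothetical helper extracting the key attributes.
    Map<String, WriteRequest> deduplicated = new LinkedHashMap<>();
    for (WriteRequest entry : requestEntries) {
        deduplicated.put(primaryKeyOf(entry), entry);
    }

    BatchWriteItemRequest request =
            BatchWriteItemRequest.builder()
                    .requestItems(
                            Collections.singletonMap(
                                    tableName, new ArrayList<>(deduplicated.values())))
                    .build();

    client.batchWriteItem(request)
            .whenComplete(
                    (response, error) -> {
                        if (error != null) {
                            // Hand the whole batch back so AsyncSinkWriter re-queues it.
                            requestResult.accept(requestEntries);
                        } else {
                            // Entries DynamoDB could not process (e.g. throttling) are
                            // handed back for retry; an empty list signals full success.
                            requestResult.accept(
                                    response.unprocessedItems()
                                            .getOrDefault(tableName, Collections.emptyList()));
                        }
                    });
}
{code}

A user-supplied failure handler like the one sketched in the comment above would then be consulted for entries that have already exceeded their retry budget, instead of re-queuing them indefinitely.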