[jira] [Commented] (FLINK-7637) FlinkKinesisProducer violates at-least-once guarantees

ASF GitHub Bot (JIRA) Mon, 23 Oct 2017 19:40:09 -0700

    [ https://issues.apache.org/jira/browse/FLINK-7637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16216206#comment-16216206 ]


ASF GitHub Bot commented on FLINK-7637:
---------------------------------------

Github user bowenli86 commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4871#discussion_r146440154
  
    --- Diff: flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisProducer.java ---
    @@ -265,19 +259,86 @@ public void close() throws Exception {
                if (kp != null) {
                        LOG.info("Flushing outstanding {} records", kp.getOutstandingRecordsCount());
                        // try to flush all outstanding records
    -                   while (kp.getOutstandingRecordsCount() > 0) {
    -                           kp.flush();
    -                           try {
    -                                   Thread.sleep(500);
    -                           } catch (InterruptedException e) {
    -                                   LOG.warn("Flushing was interrupted.");
    -                                   // stop the blocking flushing and destroy producer immediately
    -                                   break;
    -                           }
    -                   }
    +                   flushSync(kp);
    +
                        LOG.info("Flushing done. Destroying producer instance.");
                        kp.destroy();
                }
    +
    +           // make sure we propagate pending errors
    +           checkAndPropagateAsyncError();
        }
     
    +   @Override
    +   public void initializeState(FunctionInitializationContext context) throws Exception {
    +           // nothing to do
    +   }
    +
    +   @Override
    +   public void snapshotState(FunctionSnapshotContext context) throws Exception {
    +           // check for asynchronous errors and fail the checkpoint if necessary
    +           checkAndPropagateAsyncError();
    +
    +           flushSync(producer);
    +           if (producer.getOutstandingRecordsCount() > 0) {
    --- End diff --
    
    What if another thread adds records between the calls to 
`flushSync()` and `producer.getOutstandingRecordsCount()`?
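
    The interleaving in question can be sketched as follows. This is only an
    illustration: `BufferingProducer`, `FlushRaceSketch`, and `snapshot` are
    hypothetical stand-ins, not the KPL or Flink API, and the "other thread"
    is modeled by a callback that runs between the two calls so the outcome
    is deterministic. Whether the interleaving can actually occur depends on
    whether the caller serializes record emission and snapshots.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a buffering producer; NOT the real KPL API.
class BufferingProducer {
    private final AtomicInteger outstanding = new AtomicInteger();

    void addRecord() { outstanding.incrementAndGet(); }

    // Assumption: a flush acknowledges every record buffered so far.
    void flush() { outstanding.set(0); }

    int getOutstandingRecordsCount() { return outstanding.get(); }
}

class FlushRaceSketch {
    // Flush repeatedly until the buffer is observed to be empty.
    static void flushSync(BufferingProducer p) {
        while (p.getOutstandingRecordsCount() > 0) {
            p.flush();
        }
    }

    // Returns the outstanding count seen by the check that follows flushSync().
    // `interleaved` models work done by another thread between the two calls.
    static int snapshot(BufferingProducer p, Runnable interleaved) {
        flushSync(p);
        interleaved.run();
        return p.getOutstandingRecordsCount();
    }

    public static void main(String[] args) {
        BufferingProducer p = new BufferingProducer();
        p.addRecord();
        // No concurrent writer: the check sees an empty buffer.
        System.out.println(snapshot(p, () -> {}));
        // A record arrives after the flush: the check sees it as outstanding.
        System.out.println(snapshot(p, p::addRecord));
    }
}
```

    In the second case the check after `flushSync()` reports an outstanding
    record even though the flush itself completed, so the check cannot tell a
    late arrival apart from a record that failed to flush.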


> FlinkKinesisProducer violates at-least-once guarantees
> ------------------------------------------------------
>
>                 Key: FLINK-7637
>                 URL: https://issues.apache.org/jira/browse/FLINK-7637
>             Project: Flink
>          Issue Type: Bug
>          Components: Kinesis Connector
>            Reporter: Tzu-Li (Gordon) Tai
>            Assignee: Tzu-Li (Gordon) Tai
>            Priority: Blocker
>             Fix For: 1.4.0, 1.3.3
>
>
> Currently, outstanding KPL records are not flushed on checkpoints in 
> the {{FlinkKinesisProducer}}. As with the earlier at-least-once issue in the 
> Flink Kafka producer, this may lead to data loss if records fail 
> asynchronously after a checkpoint that they were part of 
> has completed.
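
The failure mode described above can be sketched with a toy sink. Everything here is an assumption for illustration: `AsyncSink` is a hypothetical class, not Flink's or the KPL's API, and a record whose name starts with "bad" stands in for a send that fails asynchronously.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical toy sink; NOT the FlinkKinesisProducer implementation.
class AsyncSink {
    private final Queue<String> pending = new ArrayDeque<>();
    private final boolean flushOnCheckpoint;
    private Throwable asyncError;

    AsyncSink(boolean flushOnCheckpoint) {
        this.flushOnCheckpoint = flushOnCheckpoint;
    }

    // Buffer the record; the actual send happens later.
    void invoke(String record) { pending.add(record); }

    // Drain the buffer. Assumption: records starting with "bad" fail to send.
    void flush() {
        String r;
        while ((r = pending.poll()) != null) {
            if (r.startsWith("bad")) {
                asyncError = new RuntimeException("send failed: " + r);
            }
        }
    }

    // Returns true if the checkpoint is allowed to complete.
    boolean snapshotState() {
        if (flushOnCheckpoint) {
            flush();
        }
        return asyncError == null;
    }

    boolean hasAsyncError() { return asyncError != null; }
}

class CheckpointLossSketch {
    public static void main(String[] args) {
        AsyncSink lossy = new AsyncSink(false);
        lossy.invoke("bad-1");
        // Checkpoint completes while "bad-1" is still buffered.
        System.out.println("lossy checkpoint ok: " + lossy.snapshotState());
        lossy.flush(); // the failure surfaces only after the checkpoint

        AsyncSink safe = new AsyncSink(true);
        safe.invoke("bad-1");
        // Flushing first surfaces the failure, so the checkpoint fails.
        System.out.println("safe checkpoint ok: " + safe.snapshotState());
    }
}
```

Without the flush, the checkpoint already covers "bad-1" when the send fails, so a restore from that checkpoint would not replay it. With the flush, the checkpoint fails instead and the record remains eligible for replay from an earlier checkpoint.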



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
