[ 
https://issues.apache.org/jira/browse/FLINK-7637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16212403#comment-16212403
 ] 

ASF GitHub Bot commented on FLINK-7637:
---------------------------------------

GitHub user tzulitai opened a pull request:

    https://github.com/apache/flink/pull/4871

     [FLINK-7637] [kinesis] Fix at-least-once guarantee in FlinkKinesisProducer

    ## What is the purpose of the change
    
    Prior to this PR, the `FlinkKinesisProducer` did not flush outstanding KPL 
records on checkpoints. As with the earlier at-least-once issue in the Flink 
Kafka producer, this can lead to data loss if records fail asynchronously after 
the checkpoint they were part of has completed.
    
    ## Brief change log
    
    - Fix at-least-once in the Kinesis producer by properly flushing on 
checkpoints.
    - Minor fixes (last 2 commits) that clean up the code.
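
To illustrate the flush-on-checkpoint idea, here is a minimal, self-contained sketch. It is not the code in this PR and does not use the real KPL or Flink APIs: `FakeAsyncClient`, `FlushOnCheckpointSink`, and their methods are hypothetical stand-ins. The assumed pattern is that the sink counts records handed to the asynchronous client, and on a checkpoint it flushes until nothing is outstanding and then rethrows any asynchronous failure, so that the checkpoint cannot complete while records it covers are still unacknowledged.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for an asynchronous producer client; NOT the real KPL API. */
class FakeAsyncClient {
    private final boolean failSends;
    private final List<CompletableFuture<Void>> pending = new ArrayList<>();

    FakeAsyncClient(boolean failSends) {
        this.failSends = failSends;
    }

    /** Buffers a send and returns a future that completes only on flush(). The payload is ignored here. */
    CompletableFuture<Void> send(String record) {
        CompletableFuture<Void> result = new CompletableFuture<>();
        pending.add(result);
        return result;
    }

    /** Completes every buffered send, successfully or exceptionally. */
    void flush() {
        List<CompletableFuture<Void>> batch = new ArrayList<>(pending);
        pending.clear();
        for (CompletableFuture<Void> result : batch) {
            if (failSends) {
                result.completeExceptionally(new RuntimeException("simulated send failure"));
            } else {
                result.complete(null);
            }
        }
    }
}

/** Sketch of a sink that flushes on checkpoint to preserve at-least-once. */
class FlushOnCheckpointSink {
    private final FakeAsyncClient client;
    private final AtomicInteger outstanding = new AtomicInteger();
    private final AtomicReference<Throwable> asyncError = new AtomicReference<>();

    FlushOnCheckpointSink(FakeAsyncClient client) {
        this.client = client;
    }

    /** Called once per record: surface earlier failures, then send asynchronously. */
    void invoke(String record) {
        checkAsyncError();
        outstanding.incrementAndGet();
        client.send(record).whenComplete((ignored, error) -> {
            if (error != null) {
                asyncError.compareAndSet(null, error);
            }
            outstanding.decrementAndGet();
        });
    }

    /** Called on checkpoint: block until nothing is outstanding, then fail on any async error. */
    void snapshotState() {
        while (outstanding.get() > 0) {
            client.flush();
        }
        checkAsyncError();
    }

    int outstandingCount() {
        return outstanding.get();
    }

    private void checkAsyncError() {
        Throwable error = asyncError.getAndSet(null);
        if (error != null) {
            throw new IllegalStateException("asynchronous send failed", error);
        }
    }
}
```

In this sketch, a failed send makes `snapshotState()` throw, which would fail the checkpoint and let the job replay the affected records from the previous one. Without the flush, the checkpoint could succeed first and the later failure would silently drop those records.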
    
    ## Verifying this change
    
    New unit tests are added to `FlinkKinesisProducerTest` to verify 
at-least-once.
    
    ## Does this pull request potentially affect one of the following parts:
    
      - Dependencies (does it add or upgrade a dependency): no
      - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
      - The serializers: no
      - The runtime per-record code paths (performance sensitive): no
      - Anything that affects deployment or recovery: no
    
    ## Documentation
    
      - Does this pull request introduce a new feature? no
      - If yes, how is the feature documented? n/a
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tzulitai/flink FLINK-7637

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4871.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4871
    
----
commit 3cbb6a9fd35437cc913e7535b5de1d4a6fb2d746
Author: Tzu-Li (Gordon) Tai <tzuli...@apache.org>
Date:   2017-10-20T09:05:25Z

    [FLINK-7637] [kinesis] Fix at-least-once guarantee in FlinkKinesisProducer
    
    Prior to this commit, the FlinkKinesisProducer did not flush outstanding
    KPL records on checkpoints. As with the earlier at-least-once issue in
    the Flink Kafka producer, this can lead to data loss if records fail
    asynchronously after the checkpoint they were part of has completed.

commit a2c3019087ea277d93eb835a1f70dc6e345e4133
Author: Tzu-Li (Gordon) Tai <tzuli...@apache.org>
Date:   2017-10-20T09:06:36Z

    [hotfix] [kinesis] Fix improper test name in FlinkKinesisProducerTest

commit d547dd7d690edb6ec058dfbcbdcb77b4b8727c95
Author: Tzu-Li (Gordon) Tai <tzuli...@apache.org>
Date:   2017-10-20T09:11:00Z

    [hotfix] [kinesis] Properly add serialVersionUIDs to FlinkKinesisProducer 
classes

----


> FlinkKinesisProducer violates at-least-once guarantees
> ------------------------------------------------------
>
>                 Key: FLINK-7637
>                 URL: https://issues.apache.org/jira/browse/FLINK-7637
>             Project: Flink
>          Issue Type: Bug
>          Components: Kinesis Connector
>            Reporter: Tzu-Li (Gordon) Tai
>            Assignee: Tzu-Li (Gordon) Tai
>            Priority: Blocker
>             Fix For: 1.4.0, 1.3.3
>
>
> Currently, the {{FlinkKinesisProducer}} does not flush outstanding KPL 
> records on checkpoints. As with the earlier at-least-once issue in the 
> Flink Kafka producer, this can lead to data loss if records fail 
> asynchronously after the checkpoint they were part of has completed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
