[ 
https://issues.apache.org/jira/browse/FLINK-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642115#comment-14642115
 ] 

ASF GitHub Bot commented on FLINK-2406:
---------------------------------------

GitHub user StephanEwen opened a pull request:

    https://github.com/apache/flink/pull/938

    [FLINK-2406] [FLINK-2402] Abstract the BarrierBuffers and add a "at least 
once" version (BarrierTracker)

    Currently, the checkpointing mechanism provides "exactly once" guarantees. 
Part of that is the step that temporarily "aligns" the data streams, performed 
by the `BarrierBuffer`. This step increases the tuple latency temporarily.
    
    By offering a version that does not provide "exactly-once", but only 
"at-least-once", we can avoid the latency increase. This is relevant for 
super-low-latency applications (< 10 ms) that tolerate duplicates.
    
    This pull request does:
    
      - Add an interface for the functionality of the barrier buffer, to allow 
adding different implementations
      - Add the `BarrierTracker` which only tracks checkpoint barriers, without 
aligning the streams.
      - Add broader tests for the BarrierBuffer, including trailing data and 
barrier races.
      - Checkpoint barriers are handled by the buffer directly, rather than 
being returned and re-injected.
      - Simplify logic in the BarrierBuffer and fix certain corner cases.
      - Give access to spill directories properly via I/O manager, rather than 
via GlobalConfiguration singleton.
      - Rename the "BarrierBufferIOTest" to "BarrierBufferMassiveRandomTest"
      - A lot of code style/robustness fixes (proplery define constants, 
visibility, exception signatures)
    
    @gyfora  and @uce : The reworked variant of the BarrierBuffer eventually 
returns `null` when all buffers and events have been consumed. The previous 
impementation repeatedly returned the `EndOfPartitionEvent`. That seemed 
inconsistent with the implementation of the "vanilla" InputGate, which 
eventually returns `null` when all partitions have finished.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/StephanEwen/incubator-flink latency

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/938.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #938
    
----
commit 9d9e34e85c812eedb7b0f6365952c853fc531627
Author: Stephan Ewen <se...@apache.org>
Date:   2015-07-12T17:33:38Z

    [tests] Add a manual test to evaluate impact of checkpointing on latency

commit 4cafad08a1b64bcaf0c95fe0b211eb740f6774b2
Author: Stephan Ewen <se...@apache.org>
Date:   2015-07-26T16:58:37Z

    [FLINK-2406] [streaming] Abstract and improve stream alignment via the 
BarrierBuffer
    
     - Add an interface for the functionaliy of the barrier buffer (for later 
addition of other implementatiions)
     - Add broader tests for the BarrierBuffer, inluding trailing data and 
barrier races.
     - Checkpoint barriers are handled by the buffer directly, rather than 
being returned and re-injected.
     - Simplify logic in the BarrierBuffer and fix certain corner cases.
     - Give access to spill directories properly via I/O manager, rather than 
via GlobalConfiguration singleton.
     - Rename the "BarrierBufferIOTest" to "BarrierBufferMassiveRandomTest"
     - A lot of code style/robustness fixes (proplery define constants, 
visibility, exception signatures)

commit ba9e5a3cc1aacb5078fdc50fee12f60ccefd2563
Author: Stephan Ewen <se...@apache.org>
Date:   2015-07-26T17:05:30Z

    [FLINK-2402] [streaming] Add a stream checkpoint barrier tracker.
    
    The BarrierTracker is non-blocking and only counts barriers.
    That way, it does not increase latency of records in the stream, but can 
only be
    used to obtain "at least once" processing guarantees.

----


> Abstract BarrierBuffer to an exchangeable BarrierHandler
> --------------------------------------------------------
>
>                 Key: FLINK-2406
>                 URL: https://issues.apache.org/jira/browse/FLINK-2406
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Streaming
>    Affects Versions: 0.10
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>             Fix For: 0.10
>
>
> We need to make the Checkpoint handling pluggable, to allow us to use 
> different implementations:
>   - BarrierBuffer for "exactly once" processing. This inevitably introduces a 
> bit of latency.
>   - BarrierTracker for "at least once" processing, with no added latency.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to