[ https://issues.apache.org/jira/browse/FLINK-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642115#comment-14642115 ]
ASF GitHub Bot commented on FLINK-2406:
---------------------------------------

GitHub user StephanEwen opened a pull request:

    https://github.com/apache/flink/pull/938

    [FLINK-2406] [FLINK-2402] Abstract the BarrierBuffers and add an "at least once" version (BarrierTracker)

    Currently, the checkpointing mechanism provides "exactly once" guarantees. Part of that is the step that temporarily "aligns" the data streams, performed by the `BarrierBuffer`. This step temporarily increases the tuple latency.

    By offering a version that does not provide "exactly once", but only "at least once", we can avoid the latency increase. This is relevant for super-low-latency applications (< 10 ms) that tolerate duplicates.

    This pull request does the following:

      - Add an interface for the functionality of the barrier buffer, to allow adding different implementations.
      - Add the `BarrierTracker`, which only tracks checkpoint barriers, without aligning the streams.
      - Add broader tests for the BarrierBuffer, including trailing data and barrier races.
      - Checkpoint barriers are handled by the buffer directly, rather than being returned and re-injected.
      - Simplify logic in the BarrierBuffer and fix certain corner cases.
      - Give access to spill directories properly via the I/O manager, rather than via the GlobalConfiguration singleton.
      - Rename the "BarrierBufferIOTest" to "BarrierBufferMassiveRandomTest".
      - A lot of code style/robustness fixes (properly define constants, visibility, exception signatures).

    @gyfora and @uce: The reworked variant of the BarrierBuffer eventually returns `null` when all buffers and events have been consumed. The previous implementation repeatedly returned the `EndOfPartitionEvent`. That seemed inconsistent with the implementation of the "vanilla" InputGate, which eventually returns `null` when all partitions have finished.
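
The alignment step described above can be illustrated with a minimal sketch. This is not the code from the pull request: the class `AligningBufferSketch`, its methods, and the use of strings as records are all hypothetical, and the sketch assumes a single checkpoint in flight at a time. It only shows why alignment adds latency: records arriving on a channel that has already delivered its barrier are held back until every other channel has delivered its barrier too.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

/**
 * Hypothetical, simplified illustration of barrier alignment ("exactly once").
 * Not Flink's BarrierBuffer; it handles only one checkpoint at a time.
 */
class AligningBufferSketch {

    // blocked[c] is true once channel c has delivered its barrier
    private final boolean[] blocked;
    private int numBlocked;

    // records that arrived on blocked channels and must wait
    private final Queue<String> held = new ArrayDeque<>();

    // what the downstream operator sees, in order
    private final List<String> output = new ArrayList<>();

    AligningBufferSketch(int numChannels) {
        this.blocked = new boolean[numChannels];
    }

    /** Forwards the record, or holds it back if its channel is blocked. */
    void onRecord(int channel, String record) {
        if (blocked[channel]) {
            held.add(record); // this waiting is the added latency
        } else {
            output.add(record);
        }
    }

    /** Blocks the channel; once all channels are blocked, checkpoints and releases. */
    void onBarrier(int channel) {
        if (blocked[channel]) {
            throw new IllegalStateException("overlapping checkpoints are not handled in this sketch");
        }
        blocked[channel] = true;
        numBlocked++;
        if (numBlocked == blocked.length) {
            output.add("CHECKPOINT");
            Arrays.fill(blocked, false);
            numBlocked = 0;
            while (!held.isEmpty()) {
                output.add(held.poll());
            }
        }
    }

    List<String> output() {
        return output;
    }
}
```

With two channels, a record that follows the barrier on channel 0 is emitted only after channel 1 has delivered its barrier as well, so it appears after the checkpoint marker in the output.
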
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/StephanEwen/incubator-flink latency

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/938.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #938

----
commit 9d9e34e85c812eedb7b0f6365952c853fc531627
Author: Stephan Ewen <se...@apache.org>
Date:   2015-07-12T17:33:38Z

    [tests] Add a manual test to evaluate impact of checkpointing on latency

commit 4cafad08a1b64bcaf0c95fe0b211eb740f6774b2
Author: Stephan Ewen <se...@apache.org>
Date:   2015-07-26T16:58:37Z

    [FLINK-2406] [streaming] Abstract and improve stream alignment via the BarrierBuffer

     - Add an interface for the functionality of the barrier buffer (for later addition of other implementations)
     - Add broader tests for the BarrierBuffer, including trailing data and barrier races.
     - Checkpoint barriers are handled by the buffer directly, rather than being returned and re-injected.
     - Simplify logic in the BarrierBuffer and fix certain corner cases.
     - Give access to spill directories properly via I/O manager, rather than via GlobalConfiguration singleton.
     - Rename the "BarrierBufferIOTest" to "BarrierBufferMassiveRandomTest"
     - A lot of code style/robustness fixes (properly define constants, visibility, exception signatures)

commit ba9e5a3cc1aacb5078fdc50fee12f60ccefd2563
Author: Stephan Ewen <se...@apache.org>
Date:   2015-07-26T17:05:30Z

    [FLINK-2402] [streaming] Add a stream checkpoint barrier tracker.

    The BarrierTracker is non-blocking and only counts barriers. That way, it does not increase latency of records in the stream, but can only be used to obtain "at least once" processing guarantees.
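
The "only counts barriers" idea from the last commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the `BarrierTracker` from the commit: the class name `CountingBarrierTrackerSketch` and its methods are hypothetical, and the sketch assumes that every input channel delivers exactly one barrier per checkpoint ID. Because no channel is ever blocked, records keep flowing at full speed, and records that overtake a barrier may be replayed after a failure, hence only "at least once".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical, simplified barrier counter for "at least once" checkpointing.
 * Not Flink's BarrierTracker; assumes one barrier per channel per checkpoint ID.
 */
class CountingBarrierTrackerSketch {

    private final int numChannels;

    // checkpoint ID -> number of barriers received so far
    private final Map<Long, Integer> pending = new HashMap<>();

    // checkpoint IDs for which all channels have delivered a barrier
    private final List<Long> completed = new ArrayList<>();

    CountingBarrierTrackerSketch(int numChannels) {
        if (numChannels < 1) {
            throw new IllegalArgumentException("need at least one input channel");
        }
        this.numChannels = numChannels;
    }

    /**
     * Records one barrier for the given checkpoint. Never blocks a channel.
     * Returns true if this barrier was the last one missing for that checkpoint.
     */
    boolean onBarrier(long checkpointId) {
        int seen = pending.merge(checkpointId, 1, Integer::sum);
        if (seen < numChannels) {
            return false;
        }
        pending.remove(checkpointId);
        completed.add(checkpointId);
        return true;
    }

    List<Long> completedCheckpoints() {
        return completed;
    }
}
```

Several checkpoints can be pending at once in this sketch, since each checkpoint ID has its own counter and nothing is held back while the counts fill up.
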
----

> Abstract BarrierBuffer to an exchangeable BarrierHandler
> --------------------------------------------------------
>
>                 Key: FLINK-2406
>                 URL: https://issues.apache.org/jira/browse/FLINK-2406
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Streaming
>    Affects Versions: 0.10
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>             Fix For: 0.10
>
>
> We need to make the Checkpoint handling pluggable, to allow us to use different implementations:
>   - BarrierBuffer for "exactly once" processing. This inevitably introduces a bit of latency.
>   - BarrierTracker for "at least once" processing, with no added latency.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)