[ 
https://issues.apache.org/jira/browse/FLINK-5021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15643785#comment-15643785
 ] 

ASF GitHub Bot commented on FLINK-5021:
---------------------------------------

GitHub user kl0u opened a pull request:

    https://github.com/apache/flink/pull/2763

    [FLINK-5021] Makes the ContinuousFileReaderOperator rescalable.

     This is the last PR that completes the refactoring of the 
`ContinuousFileReaderOperator` so that it can be rescalable. With this, the 
reader can restart from a savepoint with a different parallelism without 
compromising the provided exactly-once guarantees.
    
    The whole PR contains 3 commits. 
    
    The first removes the EOS special split which was used to signal that no 
new splits are to be processed. This was useful in the `PROCESS_ONCE` mode. Now 
the reader closes by setting a flag and waiting for all the pending splits to 
be fully processed.
    
    The second puts an additional check in the 
`ContinuousFileMonitoringFunction` that guarantees that in the case of the 
`PROCESS_ONCE`, the source will not reprocess the directory after recovering 
from a failure.
    
    Finally, the third integrates the new rescalable state abstractions with 
the reader so that the reader can restart from a savepoint with different 
parallelism and still guarantee exactly-once semantics.
    
    R: @aljoscha 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kl0u/flink repart_fs

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2763.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2763
    
----
commit 3976c70449f958da24ded2f7e5c3c1a3ef6cccd7
Author: kl0u <[email protected]>
Date:   2016-11-03T09:50:04Z

    [FLINK-5021] Removes the special EOS TimestampedFileInputSplit.
    
    Without this special split signaling that no more splits are
    to arrive, the ContinuousFileReaderOperator now closes by
    setting a flag that marks it as closed and exiting when the
    flag is set to true and the pending split queue is empty.

commit 18ebac80f34c8c5db88d8493c21a3ba2e9fa2c6c
Author: kl0u <[email protected]>
Date:   2016-11-03T10:08:45Z

    [FLINK-5021] Guarantees PROCESS_ONCE works correctly after recovering.

commit bbabe5e95aa6f115772099af250744a066eccb77
Author: kl0u <[email protected]>
Date:   2016-11-03T10:21:08Z

    [FLINK-5021] Makes the ContinuousFileReaderOperator rescalable.
    
    This is the last commit that completes the refactoring of the
    ContinuousFileReaderOperator so that it can be rescalable.
    With this, the reader can restart from a savepoint with a
    different parallelism without compromising the provided
    exactly-once guarantees.

----


> Makes the ContinuousFileReaderOperator rescalable.
> --------------------------------------------------
>
>                 Key: FLINK-5021
>                 URL: https://issues.apache.org/jira/browse/FLINK-5021
>             Project: Flink
>          Issue Type: Improvement
>          Components: filesystem-connector
>            Reporter: Kostas Kloudas
>            Assignee: Kostas Kloudas
>             Fix For: 1.2.0
>
>
> This targets integrating the ContinuousFileReaderOperator with the new 
> rescalable state abstractions, so that the operator can change parallelism 
> without jeopardizing the guarantees offered by it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to