[ 
https://issues.apache.org/jira/browse/FLINK-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15534216#comment-15534216
 ] 

ASF GitHub Bot commented on FLINK-4329:
---------------------------------------

Github user kl0u commented on the issue:

    https://github.com/apache/flink/pull/2546
  
    Hi @StephanEwen . If I understand correctly, your suggestion is to make the 
test something like the following: 1) put the split in the reader 2) read the 
split 3) when the split finishes update the time in the provider 4) observe the 
time in the output elements. If this is the case, then the problem is that the 
reader just puts the split in a queue, and this is picked up by another thread 
that reads it. In this context, there is no way of knowing when the reading 
thread has finished reading the split and goes to the next one. So step 3) 
cannot be synchronized correctly. This is the reason I am just having a thread 
in the test that tries (without guarantees - the race condition you mentioned) 
to update the time while the reader is still reading. Any suggestions are 
welcome.


> Fix Streaming File Source Timestamps/Watermarks Handling
> --------------------------------------------------------
>
>                 Key: FLINK-4329
>                 URL: https://issues.apache.org/jira/browse/FLINK-4329
>             Project: Flink
>          Issue Type: Bug
>          Components: Streaming Connectors
>    Affects Versions: 1.1.0
>            Reporter: Aljoscha Krettek
>            Assignee: Kostas Kloudas
>             Fix For: 1.2.0, 1.1.3
>
>
> The {{ContinuousFileReaderOperator}} does not correctly deal with watermarks, 
> i.e. they are just passed through. This means that when the 
> {{ContinuousFileMonitoringFunction}} closes and emits a {{Long.MAX_VALUE}} 
> that watermark can "overtake" the records that are to be emitted in the 
> {{ContinuousFileReaderOperator}}. Together with the new "allowed lateness" 
> setting in window operator this can lead to elements being dropped as late.
> Also, {{ContinuousFileReaderOperator}} does not correctly assign ingestion 
> timestamps since it is not technically a source but looks like one to the 
> user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to