Edward Rojas created FLINK-11337:
------------------------------------

             Summary: Incorrect watermark in StreaminFileSink 
BucketAssigner.Context when used in connected stream
                 Key: FLINK-11337
                 URL: https://issues.apache.org/jira/browse/FLINK-11337
             Project: Flink
          Issue Type: Bug
          Components: filesystem-connector
    Affects Versions: 1.7.0
            Reporter: Edward Rojas


When StreamingFileSink is used as sink of a connected stream the "invoke" 
method of the sink could be called before the "combinedWatermark" is updated 
with the timestamp of the element currently being processed, resulting on the 
context containing the incorrect watermark value (the Long.MIN_VALUE when using 
"AssignerWithPeriodicWatermarks" for the firsts events in the stream). 

I reproduce this when using a broadcast stream connected to a data stream. The 
broadcast stream is using a custom timestamp extractor that always return the 
Watermark.MAX_VALUE as it's done in a trining example here: 
[https://github.com/dataArtisans/flink-training-exercises/blob/master/src/main/java/com/dataartisans/flinktraining/solutions/datastream_java/broadcast/OngoingRidesSolution.java#L143.]

This is problematic as the watermark could not be used reliably to compute the 
bucket id based on event time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to