Hi Ravi,

Please check out [1] and [2]. They are about Kafka, but the same probably applies to Kinesis as well. If one stream is empty, Flink has no way to learn that stream's watermark and therefore cannot advance the overall watermark. Downstream operators consequently cannot know whether more data will still arrive from the empty stream. (Think of a source that is simply offline or has network issues for some time and, once back online, delivers all the old data.) This leaves Flink unable to commit the final result until some data arrives from the empty stream.
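To illustrate the idea, here is a minimal, hedged sketch of the usual workaround: an "idleness-aware" watermark generator that falls back to processing time when a source has been silent for too long. This is plain Java without Flink dependencies; in a real Flink 1.8 job the same logic would live inside an AssignerWithPeriodicWatermarks, with onEvent called from extractTimestamp and currentWatermark from getCurrentWatermark. The class name, the timeout values, and the fallback formula are my own illustration, not an official Flink API.

```java
// Sketch only: core logic of an idleness-aware periodic watermark generator.
// If no record has arrived for idleTimeoutMs, the watermark is advanced from
// the wall clock so that an empty stream does not stall downstream operators.
// Trade-off: late data from the idle source may then be dropped as late.
public class IdleAwareWatermarkGenerator {
    private final long maxOutOfOrdernessMs; // allowed event-time lateness
    private final long idleTimeoutMs;       // silence before we call the source idle
    private long maxTimestampSeen = Long.MIN_VALUE;
    private long lastEventWallClock;        // wall-clock time of the last record

    public IdleAwareWatermarkGenerator(long maxOutOfOrdernessMs,
                                       long idleTimeoutMs,
                                       long nowMs) {
        this.maxOutOfOrdernessMs = maxOutOfOrdernessMs;
        this.idleTimeoutMs = idleTimeoutMs;
        this.lastEventWallClock = nowMs;
    }

    // Called for every record (in Flink: from extractTimestamp).
    public void onEvent(long eventTimestampMs, long nowMs) {
        maxTimestampSeen = Math.max(maxTimestampSeen, eventTimestampMs);
        lastEventWallClock = nowMs;
    }

    // Called periodically (in Flink: from getCurrentWatermark).
    public long currentWatermark(long nowMs) {
        if (nowMs - lastEventWallClock > idleTimeoutMs) {
            // Source looks idle: derive the watermark from processing time
            // instead of waiting for events that may never come.
            return nowMs - idleTimeoutMs - maxOutOfOrdernessMs;
        }
        if (maxTimestampSeen == Long.MIN_VALUE) {
            return Long.MIN_VALUE; // no events yet and not idle long enough
        }
        return maxTimestampSeen - maxOutOfOrdernessMs;
    }

    public static void main(String[] args) {
        IdleAwareWatermarkGenerator g =
                new IdleAwareWatermarkGenerator(1000, 60_000, 0);
        g.onEvent(5000, 10_000);
        // Active path: watermark follows the highest event timestamp.
        System.out.println(g.currentWatermark(20_000));  // 4000
        // Idle path: after 60s of silence, watermark advances by wall clock.
        System.out.println(g.currentWatermark(200_000)); // 139000
    }
}
```

With such an assigner attached to the Kinesis source, the job can make event-time progress even while one of the configured streams stays empty; whether that is acceptable depends on how much late data from the idle stream you can afford to lose.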
Best regards
Theo

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kafka.html#kafka-consumers-and-timestamp-extractionwatermark-emission
[2] https://issues.apache.org/jira/browse/FLINK-5479

From: "Ravi Bhushan Ratnakar" <ravibhushanratna...@gmail.com>
To: "user" <user@flink.apache.org>
Sent: Saturday, 3 August 2019, 19:23:25
Subject: StreamingFileSink not committing file to S3

Hi All,

I am designing a streaming pipeline using Flink 1.8.1, which consumes messages from Kinesis and applies some business logic on a per-key basis using KeyedProcessFunction and checkpointing (HeapStateBackend). It consumes around 7 GB of messages per minute from multiple Kinesis streams, using a single Kinesis source configured with multiple streams. The pipeline processes data and writes output to S3 as expected, but I am experiencing a very weird issue: when one of the streams is completely empty, no files are flushed to S3, even though data is still being consumed from the rest of the streams. When I remove only this empty stream and resubmit the job, everything works fine and output is written to S3.

Regards,
Ravi