Hi Dominik,

the big conceptual difference between DataStream and Table API is that record timestamps are part of the schema in Table API whereas they are attached internally to each record in DataStream API. When you call `y.rowtime` during a stream to table conversion, the runtime will extract the internal timestamp and will copy it into the field `y`.

Even if the timestamp is not internally anymore, Flink makes sure that the watermarking (which still happens internally) remains valid. However, this means that timestamps and watermarks must already be correct when entering the Table API. If they were not correct before, they will also not trigger time-based operations correctly.

If there is no output for a parallelism > 1, usually this means that one source parition has not emitted a watermark to have progress globally for the job:

watermark of operator = min(previous operator partition 1, previous operator partition 2, ...)

I hope this helps.

Regards,
Timo


On 19.03.20 16:38, Dominik Wosiński wrote:
I have created a simple minimal reproducible example that shows what I am talking about:
https://github.com/DomWos/FlinkTTF/tree/sql-ttf

It contains a test that shows that even if the output is in order which is enforced by multiple sleeps, then for parallelism > 1 there is no output and for parallelism == 1, the output is produced normally.

Best Regards,
Dom.

Reply via email to