I am running several queries in Flink SQL, and in a final step before
inserting into Kafka, I perform an ORDER BY eventTime. When I look at the
execution plan, I see Exchange(distribution=[single]). Does this mean that
all the data is sent to a single node and sorted there? I haven't been
able to find specific documentation on this. In the Hive dialect there are
options such as SORT BY and DISTRIBUTE BY, but is there no option in
Flink SQL to sort at the partition level?
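To make it concrete, here is a simplified version of that final step (the
sink and intermediate table names, and every column except eventTime, are
placeholders rather than my real schema):

-- Final step: global sort before writing to the Kafka sink.
-- The execution plan of this statement is where I see
-- Exchange(distribution=[single]) above the Sort.
INSERT INTO kafka_sink
SELECT clientId, eventTime, payload
FROM enriched_events
ORDER BY eventTime;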

On the other hand, I am using some OVER aggregation functions such as LAG.
In these functions, the partitioning and ordering fields are specified: I
partition by a clientId field and order by the timestamp. In the source
tables, I define a WATERMARK on that timestamp field (event time). My
question is: when is this ordering performed, and up to what point? When I
define the WATERMARK, I am not subtracting any INTERVAL to wait for late
events. How would adding an INTERVAL of, for example, 10 seconds affect
when, and by what, the ordering is triggered?
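For reference, this is roughly what the source table and the OVER
aggregation look like (clientId and the event-time column match my
description above; the remaining names and connector options are
placeholders):

-- Source table: watermark declared on the event-time column, currently
-- without any bounded-out-of-orderness interval. The variant I am asking
-- about would instead be:
--   WATERMARK FOR eventTime AS eventTime - INTERVAL '10' SECOND
CREATE TABLE events (
  clientId  STRING,
  amount    DOUBLE,
  eventTime TIMESTAMP(3),
  WATERMARK FOR eventTime AS eventTime
) WITH (
  'connector' = 'kafka',
  'topic' = 'events-topic',                           -- placeholder
  'properties.bootstrap.servers' = 'localhost:9092',  -- placeholder
  'properties.group.id' = 'flink-sql-test',           -- placeholder
  'scan.startup.mode' = 'earliest-offset',
  'format' = 'json'
);

-- OVER aggregation: LAG partitioned by clientId, ordered by event time.
SELECT
  clientId,
  eventTime,
  amount,
  LAG(amount, 1) OVER (PARTITION BY clientId ORDER BY eventTime) AS prevAmount
FROM events;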
