I am running several queries in Flink SQL, and in a final step before inserting into Kafka, I perform an ORDER BY eventTime. When I look at the execution plan, I see Exchange(distribution=[single]). Does this mean that all the data is sent to a single node and reordered there? I haven't found specific documentation on this. In the Hive dialect there are options like SORT BY and DISTRIBUTE BY; is there really no option in Flink SQL to sort at the partition level?
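For reference, this is roughly the shape of the final statement I mean (table and column names here are illustrative placeholders, not my real job):

```sql
-- Illustrative sketch: final step that sorts by the event-time attribute before writing to Kafka
INSERT INTO kafka_sink
SELECT clientId, eventTime, amount
FROM enriched_events
ORDER BY eventTime;
```

Running EXPLAIN on a statement like this is where I see the Exchange(distribution=[single]) step just before the sort.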
On the other hand, I am using some OVER aggregation functions such as LAG. These functions take a partitioning field and an ordering field: I partition by a field clientId and order by timestamp. In the source tables I define a WATERMARK on the timestamp field (event time). My question is: when does this ordering actually happen, and for how long does Flink keep ordering incoming rows? In my WATERMARK definition I am not subtracting any INTERVAL to wait for late events. How would adding an INTERVAL of, say, 10 seconds affect when the ordering is triggered and which events it includes?
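To make this concrete, the setup looks roughly like the sketch below (again, names and connector options are illustrative; the question is about the general behavior, not this exact query):

```sql
-- Source table with an event-time attribute; currently no INTERVAL is subtracted in the watermark
CREATE TABLE events (
  clientId    STRING,
  amount      DOUBLE,
  `timestamp` TIMESTAMP(3),
  WATERMARK FOR `timestamp` AS `timestamp`
) WITH (
  'connector' = 'kafka',
  'topic' = 'events',                               -- illustrative topic
  'properties.bootstrap.servers' = 'localhost:9092',-- illustrative brokers
  'format' = 'json'
);

-- OVER aggregation with LAG, partitioned by clientId and ordered by the event-time column
SELECT
  clientId,
  `timestamp`,
  amount,
  LAG(amount, 1) OVER (PARTITION BY clientId ORDER BY `timestamp`) AS prev_amount
FROM events;
```

The alternative I am asking about would only change the watermark line, e.g. WATERMARK FOR `timestamp` AS `timestamp` - INTERVAL '10' SECOND.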