Hi, Guillermo.

Additionally, if there is an 'ORDER BY' clause, a Sort node should in theory 
appear in the execution plan. 

However, according to CALCITE-2798[1], it is possible for the inner 'ORDER BY' 
to be ignored during SQL parsing.
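
As a rough illustration of the case I mean (the table name `events` and the column `amount` are made up for this sketch, and my reading of CALCITE-2798 is that it concerns an ORDER BY in a sub-query that has no LIMIT or OFFSET), a query shaped like this may show no Sort node for the inner ORDER BY:

```sql
-- Hypothetical table and columns, for illustration only.
-- The inner ORDER BY has no LIMIT or OFFSET, so the planner may drop it.
EXPLAIN PLAN FOR
SELECT clientId, eventTime, amount
FROM (
  SELECT clientId, eventTime, amount
  FROM events
  ORDER BY eventTime
) AS t;
```

Checking the EXPLAIN output for your own query is the easiest way to confirm whether the Sort node was kept.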




[1] https://issues.apache.org/jira/browse/CALCITE-2798




--

    Best!
    Xuyang




On 2024-11-17 19:54:55, "Feng Jin" <jinfeng1...@gmail.com> wrote:

Hi Guillermo


> "When is this ordering done, and until when?"

Assuming the current watermark is 10:00:
1. Data with an event time before 10:00 has already been sorted and output.
2. If data after 10:00 (for example 10:01) arrives at this time, the record is 
stored in state until the watermark reaches 10:01, and is then sorted and output.
3. If data from before the watermark (for example 09:59) arrives at this time, it will be discarded as late.
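
For context, the kind of OVER aggregation this applies to might look like the sketch below. The table name `events` and the column `amount` are assumptions for illustration; `clientId` and `eventTime` are taken from your description, and `eventTime` is assumed to be the event-time attribute with a watermark defined on it.

```sql
-- Hypothetical table and columns, for illustration only.
-- Rows are ordered per clientId by event time; a row is only
-- processed once the watermark has passed its eventTime.
SELECT
  clientId,
  eventTime,
  amount,
  LAG(amount) OVER (
    PARTITION BY clientId
    ORDER BY eventTime
  ) AS previous_amount
FROM events;
```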


> "How would adding an INTERVAL of 10 seconds affect this?"

You should define the watermark so that it lags 10 seconds behind the event 
time; events arriving up to 10 seconds out of order are then still processed. 
Refer to document [1] for setting this up.

[1] 
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/create/#watermark
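
A minimal sketch of such a definition, assuming a table named `events` with a `TIMESTAMP(3)` column `eventTime` (the table name, the `amount` column, and the `datagen` connector are placeholders chosen for illustration; use your own source and connector options):

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE events (
  clientId  STRING,
  amount    DOUBLE,
  eventTime TIMESTAMP(3),
  -- The watermark trails the largest eventTime seen so far by 10 seconds.
  WATERMARK FOR eventTime AS eventTime - INTERVAL '10' SECOND
) WITH (
  'connector' = 'datagen'
);
```

With this definition, a row is sorted and emitted roughly 10 seconds of event time later than it would be with no delay, in exchange for tolerating that much disorder.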


Best,
Feng






On Mon, Nov 11, 2024 at 7:18 PM Guillermo <konstt2...@gmail.com> wrote:



I am running several queries in FlinkSQL, and in a final step before inserting 
into Kafka, I perform an ORDER BY eventTime. When I look at the execution plan, 
I see Exchange(distribution=[single]). Does this mean that all the data is 
going to a single node and getting reordered there? I haven't been able to find 
specific documentation. In the Hive dialect, there are options like SORT BY and 
DISTRIBUTE BY, but is there no option in FlinkSQL to sort at the partition 
level?


On the other hand, I am using some OVER AGGREGATION functions like LAG. In 
these functions, the partitioning and ordering fields are specified. I 
partition by a field clientId and order by timestamp. In the source tables, I 
use WATERMARKING on the timestamp field (event time). My question is, when is 
this ordering done, and until when? When I define the WATERMARKING field, I’m 
not setting any INTERVAL to wait for old events. How would adding an INTERVAL 
of 10 seconds, for example, affect when/what triggers the ordering?
