Hi Guillermo

> "When is this ordering done, and until when?"

Assume the current watermark is 10:00:
1. Records with an event time up to 10:00 are sorted and emitted.
2. If a record with an event time after 10:00 arrives now, it is stored in
state and waits until the watermark passes its timestamp (for example,
reaches 10:01) before it is sorted and emitted.
3. If a record with an event time earlier than the watermark (for example,
09:59) arrives now, it is late and will be discarded.
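
For context, the cases above apply to an event-time OVER aggregation like
the one described in the question. A minimal sketch, assuming a
hypothetical table client_events whose event-time attribute is ts (the
column the watermark is declared on) and a hypothetical column amount:

```sql
-- Hypothetical names: client_events, clientId, ts, amount.
-- Rows are buffered in state per clientId and emitted in ts order
-- once the watermark has passed their timestamp.
SELECT
  clientId,
  ts,
  amount,
  LAG(amount, 1) OVER (PARTITION BY clientId ORDER BY ts) AS prev_amount
FROM client_events;
```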


> "How would adding an INTERVAL of 10 seconds affect this?"

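Define the watermark so that it lags the event time by 10 seconds. Records
that arrive up to 10 seconds out of order are then still buffered and
sorted instead of being discarded, at the cost of roughly 10 seconds of
extra latency. Refer to document [1] for the syntax.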
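
As a rough sketch of the DDL, assuming a hypothetical table client_events
with illustrative columns, and using the datagen connector only as a
placeholder source (your Kafka source options would go there instead):

```sql
-- Hypothetical table and column names, for illustration only.
CREATE TABLE client_events (
  clientId STRING,
  amount   DOUBLE,
  ts       TIMESTAMP(3),
  -- The watermark trails the largest event time seen so far by 10 seconds.
  -- With no interval, the clause would be: WATERMARK FOR ts AS ts
  WATERMARK FOR ts AS ts - INTERVAL '10' SECOND
) WITH (
  'connector' = 'datagen'
);
```

With this definition, a record is only treated as late once the watermark
(largest ts seen minus 10 seconds) has already passed its timestamp.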

[1]
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/create/#watermark

Best,
Feng



On Mon, Nov 11, 2024 at 7:18 PM Guillermo <konstt2...@gmail.com> wrote:

>
> I am running several queries in FlinkSQL, and in a final step before
> inserting into Kafka, I perform an ORDER BY eventTime. When I look at the
> execution plan, I see Exchange(distribution=[single]). Does this mean
> that all the data is going to a single node and getting reordered there? I
> haven't been able to find specific documentation. In the Hive dialect,
> there are options like SORT BY and DISTRIBUTION BY, but is there no
> option in FlinkSQL to sort at the partition level?
>
> On the other hand, I am using some OVER AGGREGATION functions like LAG. In
> these functions, the partitioning and ordering fields are specified. I
> partition by a field clientId and order by timestamp. In the source
> tables, I use WATERMARKING on the timestamp field (event time). My
> question is, when is this ordering done, and until when? When I define the
> WATERMARKING field, I’m not setting any INTERVAL to wait for old events.
> How would adding an INTERVAL of 10 seconds, for example, affect when/what
> triggers the ordering?
>
