There Is a Delay While Over Aggregation Sending Results

wang guanglei Thu, 10 Feb 2022 22:05:37 -0800

Hey Flink Community,

I am using Flink SQL Over Aggregation
<https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/queries/over-agg/>
to count the distinct UUIDs per client IP over the past hour.
The Flink SQL I am using looks roughly like this:
SELECT
  COUNT(DISTINCT consumer_consumerUuid) OVER w AS feature_value,
  clientIp AS entity_id
FROM wide_table
WINDOW w AS (
  PARTITION BY clientIp
  ORDER BY ts
  RANGE BETWEEN INTERVAL '1' HOUR PRECEDING AND CURRENT ROW
)
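
To make the intended result concrete, the semantics of this window can be sketched in plain Python (not Flink). This only illustrates what value is expected for each row; it says nothing about when Flink emits it. The function name and the sample rows are made up for illustration.

```python
from datetime import datetime, timedelta


def distinct_uuids_last_hour(rows):
    """rows: list of (ts, client_ip, uuid) tuples.

    Returns one (client_ip, distinct_uuid_count) pair per input row,
    counting rows of the same client_ip whose ts lies in [ts - 1h, ts].
    """
    window = timedelta(hours=1)
    results = []
    for ts, client_ip, _ in rows:
        uuids = {
            u
            for (t, ip, u) in rows
            if ip == client_ip and ts - window <= t <= ts
        }
        results.append((client_ip, len(uuids)))
    return results


# Hypothetical sample data, not taken from the real wide_table.
base = datetime(2022, 2, 10, 12, 0, 0)
sample = [
    (base, "10.0.0.1", "a"),
    (base + timedelta(minutes=10), "10.0.0.1", "b"),
    (base + timedelta(minutes=20), "10.0.0.2", "a"),
    (base + timedelta(minutes=65), "10.0.0.1", "b"),
]
print(distinct_uuids_last_hour(sample))
# [('10.0.0.1', 1), ('10.0.0.1', 2), ('10.0.0.2', 1), ('10.0.0.1', 1)]
```

The last row counts only "b" because the row at 12:00 has already fallen out of the one-hour range.
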
According to the documentation, OVER aggregates produce an aggregated
value for every input row. I take that to mean the calculation is
triggered by each input event in wide_table rather than by the watermark.
Is that correct?
However, my logs always show a delay of roughly 5 to 60 seconds between
an input row arriving and the window emitting its result.


The data volume is small: there are only about 1k records per hour in
wide_table, and fewer than 10 consumers for each clientIp.

Is this delay normal, or is there something wrong with the way I am
using it?

Thanks.
