[ 
https://issues.apache.org/jira/browse/FLINK-30544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lijie Wang updated FLINK-30544:
-------------------------------
    Description: 
Currently, every time a task receives a watermark, it tries to update the 
minimum watermark.Currently, we use the traversal algorithm to find the minimum 
watermark across all channels(see 
[StatusWatermarkValue#findAndOutputNewMinWatermarkAcrossAlignedChannels|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/watermarkstatus/StatusWatermarkValve.java#:~:text=private%20void-,findAndOutputNewMinWatermarkAcrossAlignedChannels,-(DataOutput%3C%3F%3E]
 for details), and the time complexity is O(N), where N is the number of 
channels.

We can optimize it by introducing a heap-based algorthim, reducing the time 
complexity to O(log(N))

  was:
Currently, every time a task receives a watermark, it tries to update the 
minimum watermark.Currently, we use the traversal algorithm to find the minimum 
watermark across all channels(see 
[StatusWatermarkValue#findAndOutputNewMinWatermarkAcrossAlignedChannels|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/watermarkstatus/StatusWatermarkValve.java#:~:text=private%20void-,findAndOutputNewMinWatermarkAcrossAlignedChannels,-(DataOutput%3C%3F%3E]
 for details), and the time complexity is O(N), where N is the number of 
channels.

We can optimize it by introducing a heap-based algorthim, reducing the time 
complexity to O(log(N)))


> Speed up finding minimum watermark across all channels by introducing 
> heap-based algorithm
> ------------------------------------------------------------------------------------------
>
>                 Key: FLINK-30544
>                 URL: https://issues.apache.org/jira/browse/FLINK-30544
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Task
>            Reporter: Lijie Wang
>            Assignee: Lijie Wang
>            Priority: Major
>             Fix For: 1.17.0
>
>
> Currently, every time a task receives a watermark, it tries to update the 
> minimum watermark.Currently, we use the traversal algorithm to find the 
> minimum watermark across all channels(see 
> [StatusWatermarkValue#findAndOutputNewMinWatermarkAcrossAlignedChannels|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/watermarkstatus/StatusWatermarkValve.java#:~:text=private%20void-,findAndOutputNewMinWatermarkAcrossAlignedChannels,-(DataOutput%3C%3F%3E]
>  for details), and the time complexity is O(N), where N is the number of 
> channels.
> We can optimize it by introducing a heap-based algorthim, reducing the time 
> complexity to O(log(N))



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to