Roman Khachatryan created FLINK-26062:
-----------------------------------------
Summary: [Changelog] Non-deterministic recovery of PriorityQueue states
Key: FLINK-26062
URL: https://issues.apache.org/jira/browse/FLINK-26062
Project: Flink
Issue Type: Bug
Components: Runtime / State Backends
Affects Versions: 1.15.0
Reporter: Roman Khachatryan
Assignee: Roman Khachatryan
Fix For: 1.15.0

Currently, InternalPriorityQueue.poll() is logged as a separate operation, without specifying the element that has been polled. On recovery, this recorded poll() is replayed.

However, the replay is not deterministic, because the order of PQ elements with equal priority is not specified. For example, TimerHeapInternalTimer only compares timestamps, which are often equal.

As a result, timers are polled from the queue in the wrong order => timers are dropped => timers do not fire.

ProcessingTimeWindowCheckpointingITCase.testAggregatingSlidingProcessingTimeWindow fails when materialization is enabled and the heap state backend is used (both in-memory and fs-based implementations).

The proposed solution is to replace the logged poll with a remove operation (which is based on equality).

cc: [~masteryhx], [~ym], [~yunta]

--
This message was sent by Atlassian Jira (v8.20.1#820001)
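
The difference between replaying a logged poll and replaying a logged remove can be sketched in plain Java. This is only an illustration under stated assumptions: java.util.PriorityQueue stands in for InternalPriorityQueue, and the Timer class, the queueOf helper and the replayPoll/replayRemove methods are hypothetical names, not Flink code. The Timer here is ordered by timestamp only but compared for equality by timestamp and key, mirroring the situation described above.

```java
import java.util.Comparator;
import java.util.Objects;
import java.util.PriorityQueue;

final class PollReplaySketch {

    // Hypothetical stand-in for a timer: ordered by timestamp only,
    // but equal only if both timestamp and key match.
    static final class Timer {
        final long timestamp;
        final String key;

        Timer(long timestamp, String key) {
            this.timestamp = timestamp;
            this.key = key;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Timer)) {
                return false;
            }
            Timer other = (Timer) o;
            return timestamp == other.timestamp && key.equals(other.key);
        }

        @Override
        public int hashCode() {
            return Objects.hash(timestamp, key);
        }

        @Override
        public String toString() {
            return "Timer(" + timestamp + ", " + key + ")";
        }
    }

    // Priority ignores the key, so timers with equal timestamps are ties.
    static final Comparator<Timer> BY_TIMESTAMP =
            Comparator.comparingLong(t -> t.timestamp);

    // Builds a queue by inserting the timers in the given order.
    static PriorityQueue<Timer> queueOf(Timer... timers) {
        PriorityQueue<Timer> queue = new PriorityQueue<>(BY_TIMESTAMP);
        for (Timer t : timers) {
            queue.add(t);
        }
        return queue;
    }

    // Replaying a logged "poll": removes whatever the rebuilt queue has at
    // its head, which need not be the element polled before the failure.
    static Timer replayPoll(PriorityQueue<Timer> restored) {
        return restored.poll();
    }

    // Replaying a logged "remove(element)": removes exactly the recorded
    // element, located via equals(), independent of tie order.
    static boolean replayRemove(PriorityQueue<Timer> restored, Timer logged) {
        return restored.remove(logged);
    }

    public static void main(String[] args) {
        Timer a = new Timer(100L, "a");
        Timer b = new Timer(100L, "b");

        // Before the failure: one timer is polled and the operation is logged.
        PriorityQueue<Timer> original = queueOf(a, b);
        Timer polled = original.poll();

        // After recovery the queue may be rebuilt in a different order.
        Timer replayed = replayPoll(queueOf(b, a));
        System.out.println("polled originally: " + polled
                + ", polled on replay: " + replayed);

        // Remove-based replay leaves the same timer behind as the original run.
        PriorityQueue<Timer> restored = queueOf(b, a);
        replayRemove(restored, polled);
        System.out.println("remaining after remove-based replay: " + restored
                + ", remaining originally: " + original);
    }
}
```

Which of the two tied timers a poll returns depends on the queue implementation and on insertion order, so the sketch does not rely on it. The remove-based replay yields the same remaining element as the original run in either case.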