ion created FLINK-39898:
---------------------------

             Summary: ProcessingTimeService timer callbacks delayed until 
checkpoint when mini-batch and unaligned checkpoints are enabled
                 Key: FLINK-39898
                 URL: https://issues.apache.org/jira/browse/FLINK-39898
             Project: Flink
          Issue Type: Bug
    Affects Versions: 2.1.2
            Reporter: ion


  h3. Description

  When both {{table.exec.mini-batch.enabled = true}} and 
{{execution.checkpointing.unaligned.enabled = true}} are set, timer callbacks 
registered via {{ProcessingTimeService.registerTimer()}} are not executed 
between checkpoints. They are only processed during checkpoint barrier handling.

  This is a regression from Flink 1.20, introduced by the urgent mail system in 
Flink 2.1 (FLINK-35796). Unaligned checkpoint barriers are submitted as urgent 
mails, which causes {{TaskMailboxImpl.tryTakeFromBatch()}} to skip non-urgent 
mails, including {{ProcessingTimeService}} timer callbacks.

  h3. Reproduction

  * Flink 2.1.2
  * {{table.exec.mini-batch.enabled = true}}
  * {{execution.checkpointing.unaligned.enabled = true}}
  * {{execution.checkpointing.interval = 3m}}
  * Any operator that uses {{ProcessingTimeService}} timers (e.g., upsert-kafka 
sink with {{sink.buffer-flush.interval > 0}})

  *Expected:* Timer callbacks fire at the registered interval.
  *Actual:* Timer callbacks only fire at checkpoint time.
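
  The settings listed above could be applied in the SQL client roughly as 
follows. This is an illustrative fragment and not part of the original report: 
the {{allow-latency}} and {{size}} values are assumed placeholders, added only 
because mini-batch requires both to be positive once it is enabled.

  {code:sql}
  -- Settings taken from the reproduction list
  SET 'table.exec.mini-batch.enabled' = 'true';
  SET 'execution.checkpointing.unaligned.enabled' = 'true';
  SET 'execution.checkpointing.interval' = '3m';

  -- Assumed values (not from the report); mini-batch needs both to be > 0
  SET 'table.exec.mini-batch.allow-latency' = '5 s';
  SET 'table.exec.mini-batch.size' = '5000';
  {code}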

  h3. Test Results

  ||Configuration||Timer works between checkpoints||
  |Flink 1.20 + unaligned ON|(/) works|
  |Flink 2.1 + unaligned OFF|(/) works|
  |Flink 2.1 + unaligned ON|(x) broken|

  h3. Workaround

  Disable either mini-batch or unaligned checkpoints.
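
  As a sketch, the workaround corresponds to one of the following SQL client 
settings. Per the statement above, either one alone should be enough; which 
one is preferable depends on the job and is not covered by this report.

  {code:sql}
  -- Option 1: turn off mini-batch
  SET 'table.exec.mini-batch.enabled' = 'false';

  -- Option 2: fall back to aligned checkpoints
  SET 'execution.checkpointing.unaligned.enabled' = 'false';
  {code}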



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
