Thanks for the FLIP. This ability is needed in production. I would like to add a
suggestion: dynamic adjustment of the following configuration parameters would
also be helpful:

  - `execution.checkpointing.max-concurrent-checkpoints`
  - `execution.checkpointing.min-pause`
  - `execution.checkpointing.tolerable-failed-checkpoints`
 

Best
xingsuo-zbz 

At 2026-04-08 15:51:08, "熊饶饶" <[email protected]> wrote:
>Thanks for the FLIP. It is useful for users. I have only one question: could
>JM memory pressure under high-concurrency sampling cause OOM in large-scale
>jobs?
>
>> On March 24, 2026, at 16:29, Jiangang Liu <[email protected]> wrote:
>> 
>> Hi everyone,
>> 
>> I would like to start a discussion on FLIP-571: Support Dynamically
>> Updating Checkpoint Configuration at Runtime via REST API [1].
>> 
>> Currently, checkpoint configuration (checkpointInterval, checkpointTimeout)
>> is immutable after job submission. This creates significant operational
>> challenges for long-running streaming jobs:
>> 
>>   1. Cascading checkpoint failures cannot be resolved without restarting
>>   the job, causing data reprocessing delays.
>>   2. Near-complete checkpoints (e.g., 95% persisted) are entirely discarded
>>   on timeout — wasting all I/O work and potentially creating a failure
>>   loop for large-state jobs.
>>   3. Static configuration cannot adapt to variable workloads at runtime.
>> 
>> FLIP-571 proposes a new REST API endpoint:
>> 
>> PATCH /jobs/:jobid/checkpoints/configuration
>> 
>> Key design points:
>> 
>>   - Timeout changes apply immediately to in-flight checkpoints by
>>   rescheduling their canceller timers, saving near-complete checkpoints
>>   from being discarded.
>>   - Interval changes take effect on the next checkpoint trigger cycle.
>>   - Configuration overrides are persisted to ExecutionPlanStore (following
>>   the JobResourceRequirements pattern) and automatically restored after
>>   failover.
>> 
>> For more details, please refer to the FLIP [1].
>> 
>> Looking forward to your feedback and suggestions!
>> 
>> [1]
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-571%3A+Support+Dynamically+Updating+Checkpoint+Configuration+at+Runtime+via+REST+API
>> 
>> Best regards,
>> Jiangang Liu
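
The timeout behavior described in the key design points (rescheduling the canceller timer of an in-flight checkpoint) can be illustrated with a small sketch. This is only a model under stated assumptions, not the FLIP's implementation: `PendingCheckpoint` and `reschedule_canceller` are hypothetical names, not Flink classes, and the sketch assumes the new timeout is measured from the checkpoint's original start time.

```python
from dataclasses import dataclass


@dataclass
class PendingCheckpoint:
    """Hypothetical stand-in for an in-flight checkpoint (not a Flink class)."""
    checkpoint_id: int
    start_ms: int    # time at which the checkpoint was triggered
    timeout_ms: int  # timeout currently in force

    def deadline_ms(self) -> int:
        # Absolute time at which the canceller would discard the checkpoint.
        return self.start_ms + self.timeout_ms

    def is_expired(self, now_ms: int) -> bool:
        return now_ms >= self.deadline_ms()


def reschedule_canceller(cp: PendingCheckpoint, new_timeout_ms: int, now_ms: int) -> int:
    """Apply a new timeout and return the delay (ms) for the replacement timer.

    Assumption: the new timeout counts from the checkpoint's start, so the
    remaining delay is (start + new_timeout - now), floored at zero.
    """
    cp.timeout_ms = new_timeout_ms
    return max(0, cp.deadline_ms() - now_ms)


# A checkpoint started at t=0 with a 10-minute timeout is 10 seconds away
# from being discarded when the timeout is raised to 15 minutes.
cp = PendingCheckpoint(checkpoint_id=42, start_ms=0, timeout_ms=600_000)
remaining = reschedule_canceller(cp, new_timeout_ms=900_000, now_ms=590_000)
```

Under this assumption, lowering the timeout below the already elapsed time yields a delay of zero, meaning the checkpoint would be cancelled immediately. Whether the actual design does that, or measures from the time of the update instead, is something the FLIP itself would need to specify.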
