[ 
https://issues.apache.org/jira/browse/FLINK-33897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17803407#comment-17803407
 ] 

Zakelly Lan commented on FLINK-33897:
-------------------------------------

[~pnowojski]  Actually, there is a real-world motivation. When a job encounters 
high back-pressure and aligned checkpointing has gone dozens of minutes without 
success, the user finds that they need to switch to unaligned cp or increase the 
parallelism. Such a change requires a job restart, which puts users in a dilemma 
because it involves replaying a lot of data and a longer delay. This feature 
allows users to take an unaligned cp temporarily and restart from it, 
avoiding the large data replay.

I do agree we could enable the timeout for aligned cp by default, which would 
greatly reduce such cases. I also think there would be value in giving users a 
chance to change the configuration and restart the job with less pain when they 
have misconfigured their jobs, by supporting triggering a checkpoint or 
savepoint that is fast and likely to succeed. As for the complexity of 
supporting this feature, IIUC, some changes would apply to the handler states 
(possibly introducing a new {{BarrierHandlerState}}) and fewer changes would be 
made to the {{SingleCheckpointBarrierHandler}} itself. I'm not very familiar 
with this part, so if you think this is a big change, I won't insist on doing it.

> Allow triggering unaligned checkpoint via CLI
> ---------------------------------------------
>
>                 Key: FLINK-33897
>                 URL: https://issues.apache.org/jira/browse/FLINK-33897
>             Project: Flink
>          Issue Type: New Feature
>          Components: Command Line Client, Runtime / Checkpointing
>            Reporter: Zakelly Lan
>            Assignee: Zakelly Lan
>            Priority: Major
>
> After FLINK-6755, users can trigger a checkpoint through the CLI. However, I 
> noticed there would be value in supporting triggering it in an unaligned way, 
> since the job may encounter high back-pressure and an aligned checkpoint 
> would fail.
>  
> I suggest we provide an option '-unaligned' in CLI to support that.
>  
> A similar option would also be useful for the REST API.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
