Unsubscribe

At 2024-01-08 17:45:01, "Piotr Nowojski" <pnowoj...@apache.org> wrote:
>Hi, thanks for the responses,
>
>And thanks for pointing out the job upgrade issue. Indeed, that had slipped
>my mind. I was mistakenly thinking that we support upgrades only via
>savepoints. Anyway, maybe in that case we should guide users towards that,
>i.e. using savepoints for upgrades? That would be even easier for users to
>understand:
>- use unaligned checkpoints for checkpoints
>- use savepoints for any changes in the job/version upgrades
>
>There is a downside, that savepoints are always full, while aligned
>checkpoints can be incremental.
>
>WDYT?
>
>Regarding the value for the timeout, I would also be fine with 30s. Indeed
>that's a safer default.
>
>> On a separate point, in the sentence below it seems to me it would be
>> clearer to say that in the unlikely scenario you've described, the change
>> would "significantly increase checkpoint sizes" -- assuming I understand
>> things correctly.
>
>I've reworded that paragraph.
>
>Best,
>Piotrek
>
>
>
>On Mon, 8 Jan 2024 at 08:02, Rui Fan <1996fan...@gmail.com> wrote:
>
>> Thanks to Piotr for driving this proposal!
>>
>> Enabling unaligned checkpoints together with an aligned checkpoint timeout
>> is fine with me. I'm not sure whether an aligned checkpoint timeout of 5s
>> is too aggressive. If unaligned checkpoints are enabled by default
>> for all jobs, I recommend that the aligned checkpoint timeout be
>> at least 30s.
>>
>> If 30s is too long for some Flink jobs, users can lower
>> it themselves.
>>
>> To David, Ken and Zhanghao:
>>
>> Unaligned checkpoints do have more limitations than aligned checkpoints,
>> but if we set the aligned checkpoint timeout to 30s or 60s, then
>> whenever a checkpoint can be completed within 30s or 60s, the job still
>> uses aligned checkpoints (so it has no extra cost).
>> When a checkpoint cannot be completed within the aligned checkpoint
>> timeout, the aligned checkpoint is switched to an unaligned checkpoint.
>> An unaligned checkpoint can be completed even when backpressure is severe.
>>
>> In brief: when backpressure is low, enabling them has no extra cost;
>> when backpressure is high, enabling them brings some benefits.
>>
>> So I think there is little risk when the aligned checkpoint timeout
>> is set to 30s or above. WDYT?
>>
>> Best,
>> Rui
>>
>> On Mon, Jan 8, 2024 at 12:57 PM Zhanghao Chen <zhanghao.c...@outlook.com>
>> wrote:
>>
>> > Hi Piotr,
>> >
>> > As a platform administrator who runs thousands of Flink jobs, I'd be
>> > against enabling unaligned checkpoints by default for our jobs. It may
>> > help a significant portion of users, but the subtle issues around
>> > unaligned checkpoints for a few jobs will probably cause a lot more
>> > on-call pages and incidents. From my point of view, we'd better not
>> > enable it by default before removing all the limitations listed in
>> > https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/ops/state/checkpointing_under_backpressure/#limitations
>> >
>> > Best,
>> > Zhanghao Chen
>> > ________________________________
>> > From: Piotr Nowojski <pnowoj...@apache.org>
>> > Sent: Friday, January 5, 2024 21:41
>> > To: dev <dev@flink.apache.org>
>> > Subject: FLIP-413: Enable unaligned checkpoints by default
>> >
>> > Hi!
>> >
>> > I would like to propose enabling unaligned checkpoints by default and
>> > simultaneously increasing the aligned checkpoint timeout from 0ms to 5s.
>> > I think this is the right change for the majority of Flink users.
>> >
>> > For more rationale, please take a look at the short FLIP-413 [1].
>> >
>> > What do you all think?
>> >
>> > Best,
>> > Piotrek
>> >
>> >
>> >
>> > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-413%3A+Enable+unaligned+checkpoints+by+default
>> >
>>
