Re: Shared Checkpoint Cleanup and S3 Lifecycle Policy

2020-05-08 Thread Congxian Qiu
Hi,
Currently it is hard to determine which files can be safely deleted from the shared folder; the ground truth is in the checkpoint metafile. I've created an issue[1] for such a feature.
[1] https://issues.apache.org/jira/browse/FLINK-17571
Best,
Congxian

On Fri, May 8, 2020 at 1:05 PM, Trystan wrote:
> Aha, …
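
For illustration, here is a rough sketch of how one could list the shared files that a retained checkpoint still references, i.e. the "ground truth" in the metafile mentioned above. It relies on Flink internals (everything under org.apache.flink.runtime is unstable API; the class names and signatures below follow Flink 1.11 and should be treated as assumptions, not a supported interface):

    import java.io.DataInputStream;
    import java.io.FileInputStream;
    import org.apache.flink.runtime.checkpoint.Checkpoints;
    import org.apache.flink.runtime.checkpoint.OperatorState;
    import org.apache.flink.runtime.checkpoint.OperatorSubtaskState;
    import org.apache.flink.runtime.checkpoint.metadata.CheckpointMetadata;
    import org.apache.flink.runtime.state.IncrementalRemoteKeyedStateHandle;
    import org.apache.flink.runtime.state.KeyedStateHandle;

    public class ListLiveSharedFiles {
        public static void main(String[] args) throws Exception {
            String metadataFile = args[0]; // local copy of a checkpoint's _metadata file
            try (DataInputStream in = new DataInputStream(new FileInputStream(metadataFile))) {
                CheckpointMetadata meta = Checkpoints.loadCheckpointMetadata(
                        in, ListLiveSharedFiles.class.getClassLoader(), metadataFile);
                for (OperatorState op : meta.getOperatorStates()) {
                    for (OperatorSubtaskState subtask : op.getStates()) {
                        for (KeyedStateHandle handle : subtask.getManagedKeyedState()) {
                            if (handle instanceof IncrementalRemoteKeyedStateHandle) {
                                // Shared state = the RocksDB SST files under /shared that
                                // this (still valid) checkpoint references. Anything in
                                // /shared referenced by no retained checkpoint is garbage.
                                ((IncrementalRemoteKeyedStateHandle) handle)
                                        .getSharedState()
                                        .values()
                                        .forEach(h -> System.out.println(h));
                            }
                        }
                    }
                }
            }
        }
    }

Running something like this over every retained checkpoint's _metadata and taking the union would give the set of /shared objects that must not be deleted; FLINK-17571 tracks building such a facility into Flink itself.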

Re: Shared Checkpoint Cleanup and S3 Lifecycle Policy

2020-05-07 Thread Trystan
Aha, so incremental checkpointing *does* rely on arbitrarily old checkpoint state, regardless of how many checkpoints are retained. The documentation wasn't entirely clear about this. One would assume that if you retain 3 checkpoints, anything older than the 3rd is irrelevant, but that's evidently not the case. …

Re: Shared Checkpoint Cleanup and S3 Lifecycle Policy

2020-05-07 Thread Congxian Qiu
Hi,
Yes, only files used by checkpoints 8, 9, and 10 should be in the checkpoint directory, but you cannot delete files just because they were created more than 3 minutes ago (checkpoints 8, 9, and 10 may reuse files created by earlier checkpoints; this is how incremental checkpointing works[1]). You can a…
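
To make the reuse concrete, here is a hypothetical timeline (the file name is invented for illustration). A RocksDB SST file is uploaded once and stays referenced until compaction drops it:

    chk-2            uploads  shared/sst-000042   (file is new)
    chk-3 … chk-10   reuse    shared/sst-000042   (no re-upload; each checkpoint's _metadata just references it)

At checkpoint 10 the file is roughly 8 minutes old but still live, so an age-based S3 lifecycle rule such as "delete after 3 minutes" would corrupt checkpoints 8, 9, and 10.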

Re: Shared Checkpoint Cleanup and S3 Lifecycle Policy

2020-05-06 Thread Trystan
Thanks Congxian! To make sure I'm understanding correctly: if I retain 3 incremental checkpoints (say, one every minute), and I've just completed checkpoint 10, then everything in shared is from checkpoints 8 and 9 only. So anything older than ~3 minutes can safely be deleted? The state from checkpoint 5 doesn't matter anymore? …

Re: Shared Checkpoint Cleanup and S3 Lifecycle Policy

2020-05-06 Thread Congxian Qiu
Hi,
For the rate limit, could you please try entropy injection[1]? For checkpoints, Flink handles the file lifecycle itself: it deletes a file once the file can never be used again. A file in the checkpoint directory remains there as long as the corresponding checkpoint is still valid.
[1] https://ci.apache.org…
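
For reference, entropy injection is configured in flink-conf.yaml. A minimal sketch (the option keys s3.entropy.key and s3.entropy.length come from the Flink docs; the bucket name and path are placeholders):

    s3.entropy.key: _entropy_
    s3.entropy.length: 4
    state.checkpoints.dir: s3://my-bucket/checkpoints/_entropy_/

Flink replaces the _entropy_ marker with random characters (here 4) for each checkpoint data file it writes, spreading the files across many S3 key prefixes; for the checkpoint's _metadata file the marker is simply removed, so the metadata stays at a predictable path.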

Shared Checkpoint Cleanup and S3 Lifecycle Policy

2020-05-06 Thread Trystan
Hello! Recently we ran into an issue when checkpointing to S3. Because S3 rate-limits based on key prefix, the /shared directory would get slammed and cause S3 throttling. There is no solution for this, because /job/checkpoint/:id/shared is all part of the same prefix, and is limited to 3,500 PUT/COPY/POST/DELETE requests per second. …
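
To illustrate the problem, every shared-state object lands under a single key prefix, so all writes compete for the same request budget (the paths below are made up for illustration):

    s3://bucket/job/checkpoint/1234/shared/sst-000001
    s3://bucket/job/checkpoint/1234/shared/sst-000002
    s3://bucket/job/checkpoint/1234/shared/sst-000003

All of these share the prefix job/checkpoint/1234/shared/, so every RocksDB upload from every subtask draws from the same 3,500 writes/s allowance. Entropy injection, suggested earlier in the thread, breaks this up by inserting random characters into the path so that the load spreads across many prefixes.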