I use S3 as checkpoint storage for Flink in two setups: on EMR (EC2) with
YARN, and on EKS.
There should not be any problem with it.
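
For the REST call you describe, here is a minimal sketch of how the request
could be put together. It only builds the POST for "/jars/:jarid/run" and
does not send it. The host, port, and jar id are hypothetical placeholders,
and the S3 path is left as the same placeholder you used. As far as I know,
'savepointPath' should point at the directory of one specific retained
checkpoint (the one that holds the _metadata file), not at the checkpoint
root.

```python
import json
import urllib.request


def build_run_request(base_url, jar_id, checkpoint_path, allow_non_restored=False):
    """Build (but do not send) a POST request for /jars/:jarid/run.

    base_url and jar_id are illustrative; jar_id would be the id returned
    when the jar was uploaded to the cluster.
    """
    body = {
        # Retained checkpoints are passed in the same field as savepoints.
        "savepointPath": checkpoint_path,
        # Whether state that maps to no operator in the new job may be skipped.
        "allowNonRestoredState": allow_non_restored,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/jars/{jar_id}/run",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # All three values below are placeholders, not real resources.
    req = build_run_request(
        "http://localhost:8081",
        "example-job.jar",
        "s3://<path_to_checkpoint>",
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually submit: urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to log or
review the exact payload before anything is submitted to the cluster.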

Thanks
Sachin


On Thu, Apr 24, 2025 at 12:09 PM Anuj Jain <anuj...@gmail.com> wrote:

> Dear Apache Flink Community,
>
>
>
> I hope this message finds you well. We are currently exploring the option
> of utilizing Amazon S3 as a checkpoint storage solution alongside our
> Apache Flink server. As part of this effort, we understand that AWS S3
> access must be configured properly, and checkpoints need to be externalized
> and retained on S3.
>
>
>
> In the Apache Flink documentation, I found the following information about
> externalized checkpoints:
>
> "Externalized Checkpoints – Normally, checkpoints are not intended to be
> manipulated by users. Flink retains only the n-most-recent checkpoints (n
> being configurable) while a job is running and deletes them when a job is
> cancelled. However, you can configure them to be retained, allowing manual
> resumption from them."
>
>
>
> Given this context, I would appreciate the community's input on the
> following query:
>
> Is it officially supported in a production setup to use the Flink run API
> to resume a job from an externalized checkpoint stored on AWS S3?
> Specifically, is it possible to invoke the REST API endpoint
> "/jars/:jarid/run" and provide 'savepointPath' as a checkpoint directory on
> S3 storage, such as "s3://<path_to_checkpoint>"?
>
> Your insights and experiences would be invaluable as we evaluate this
> approach. Thank you for your assistance and support.
>
>
>
