Dear Apache Flink Community,


I hope this message finds you well. We are evaluating Amazon S3 as
checkpoint storage for our Apache Flink deployment. As we understand it,
this requires configuring S3 access for Flink and enabling externalized
checkpoints so that they are retained on S3.



In the Apache Flink documentation, I found the following information about
externalized checkpoints:

"Externalized Checkpoints – Normally, checkpoints are not intended to be
manipulated by users. Flink retains only the n-most-recent checkpoints (n
being configurable) while a job is running and deletes them when a job is
cancelled. However, you can configure them to be retained, allowing manual
resumption from them."



Given this context, I would appreciate the community's input on the
following query:

Is it officially supported, in a production setup, to resume a job from an
externalized checkpoint stored on AWS S3 through the Flink REST API?
Specifically, can we call the REST endpoint "/jars/:jarid/run" and set
'savepointPath' to a checkpoint directory on S3, such as
"s3://<path_to_checkpoint>"?
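
For concreteness, here is a minimal sketch of the call we have in mind. It
only illustrates the question and is not a tested setup: the host
"http://localhost:8081", the jar id "example.jar", and the helper names are
hypothetical, and the S3 path is left as the placeholder above. The
'savepointPath' and 'allowNonRestoredState' fields are the ones we believe
the "/jars/:jarid/run" request body accepts.

```python
import json
import urllib.request


def build_run_request(base_url, jar_id, checkpoint_path, allow_non_restored=False):
    """Build (but do not send) a POST request for /jars/:jarid/run.

    checkpoint_path is passed in the 'savepointPath' field; whether a
    retained checkpoint on S3 is officially supported there is exactly
    the question we are asking.
    """
    body = {
        "savepointPath": checkpoint_path,
        "allowNonRestoredState": allow_non_restored,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/jars/{jar_id}/run",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

We would then call something like
submit(build_run_request("http://localhost:8081", "example.jar",
"s3://<path_to_checkpoint>")) against the JobManager's REST address.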

Your insights and experiences would be invaluable as we evaluate this
approach. Thank you for your assistance and support.
