[jira] [Commented] (FLINK-37319) Add retry in RocksDBStateUploader for fault tolerant

Zhenqiu Huang (Jira) Thu, 13 Feb 2025 14:43:12 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-37319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17926984#comment-17926984
 ]


Zhenqiu Huang commented on FLINK-37319:
---------------------------------------

For Flink applications that use cloud storage for the state backend, 
checkpoints can fail because of object store DR (sometimes quite often, with a 
90-second timeout). It would therefore be great to add an additional layer of 
retry in RocksDBStateUploader to handle such transient failures.
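
To make the proposal concrete, here is a minimal sketch of what such a retry 
layer might look like. It is not taken from the Flink code base: the class 
UploadRetrySketch, the UploadAttempt interface, and the maxAttempts and 
initialBackoffMillis parameters are all hypothetical names chosen for 
illustration, and the assumption is that a transient object store failure 
surfaces as an IOException from a single upload attempt.

```java
import java.io.IOException;

/** Illustrative only; none of these names exist in Flink. */
final class UploadRetrySketch {

    /** Hypothetical stand-in for one upload attempt of a state file. */
    @FunctionalInterface
    interface UploadAttempt<T> {
        T run() throws IOException;
    }

    /**
     * Runs the attempt up to maxAttempts times, sleeping between failed
     * attempts and doubling the sleep time after each failure.
     * Rethrows the last IOException once all attempts are used up.
     */
    static <T> T uploadWithRetry(
            UploadAttempt<T> attempt, int maxAttempts, long initialBackoffMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoffMillis = initialBackoffMillis;
        IOException lastFailure = null;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                return attempt.run();
            } catch (IOException e) {
                // Assumption: IOException marks a transient failure worth retrying.
                lastFailure = e;
                if (i < maxAttempts) {
                    Thread.sleep(backoffMillis);
                    backoffMillis *= 2;
                }
            }
        }
        throw lastFailure;
    }
}
```

In this sketch an interrupt during the backoff sleep propagates as 
InterruptedException rather than being retried, so a cancelled checkpoint 
would not keep the upload thread busy. Whether the retry belongs around each 
file upload or around the whole batch is left open here.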

> Add retry in RocksDBStateUploader for fault tolerant
> ----------------------------------------------------
>
>                 Key: FLINK-37319
>                 URL: https://issues.apache.org/jira/browse/FLINK-37319
>             Project: Flink
>          Issue Type: Improvement
>            Reporter: Zhenqiu Huang
>            Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
