[ https://issues.apache.org/jira/browse/FLINK-35232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17841999#comment-17841999 ]
Galen Warren commented on FLINK-35232: -------------------------------------- [~xtsong] You helped with the original implementation of the GCS RecoverableWriter implementation. This proposal – to allow people to adjust the retry settings – makes sense to me, but I don't really know what guidance to give the people proposing it as far as potential next steps might be. How would something like this potentially move forward? Thanks. > Support for retry settings on GCS connector > ------------------------------------------- > > Key: FLINK-35232 > URL: https://issues.apache.org/jira/browse/FLINK-35232 > Project: Flink > Issue Type: Improvement > Components: Connectors / FileSystem > Affects Versions: 1.15.3, 1.16.2, 1.17.1, 1.19.0, 1.18.1 > Reporter: Vikas M > Assignee: Ravi Singh > Priority: Major > > https://issues.apache.org/jira/browse/FLINK-32877 is tracking ability to > specify transport options in GCS connector. While setting the params enabled > here reduced read timeouts, we still see 503 errors leading to Flink job > restarts. > Thus, in this ticket, we want to specify additional retry settings as noted > in > [https://cloud.google.com/storage/docs/retry-strategy#customize-retries.|https://cloud.google.com/storage/docs/retry-strategy#customize-retries] > We need > [these|https://cloud.google.com/java/docs/reference/gax/latest/com.google.api.gax.retrying.RetrySettings#methods] > methods available for Flink users so that they can customize their > deployment. In particular next settings seems to be the minimum required to > adjust GCS timeout with Job's checkpoint config: > * > [maxAttempts|https://cloud.google.com/java/docs/reference/gax/latest/com.google.api.gax.retrying.RetrySettings#com_google_api_gax_retrying_RetrySettings_getMaxAttempts__] > * > [initialRpcTimeout|https://cloud.google.com/java/docs/reference/gax/latest/com.google.api.gax.retrying.RetrySettings#com_google_api_gax_retrying_RetrySettings_getInitialRpcTimeout__] > * > [rpcTimeoutMultiplier|https://cloud.google.com/java/docs/reference/gax/latest/com.google.api.gax.retrying.RetrySettings#com_google_api_gax_retrying_RetrySettings_getRpcTimeoutMultiplier__] > * > [maxRpcTimeout|https://cloud.google.com/java/docs/reference/gax/latest/com.google.api.gax.retrying.RetrySettings#com_google_api_gax_retrying_RetrySettings_getMaxRpcTimeout__] > * > [totalTimeout|https://cloud.google.com/java/docs/reference/gax/latest/com.google.api.gax.retrying.RetrySettings#com_google_api_gax_retrying_RetrySettings_getTotalTimeout__] > > Basically the proposal is to be able to tune the timeout via multiplier, > maxAttemts + totalTimeout mechanisms. > All of the config options should be optional and the default one should be > used in case some of configs are not provided. -- This message was sent by Atlassian Jira (v8.20.10#820010)