[jira] [Commented] (FLINK-10020) Kinesis Consumer listShards should support more recoverable exceptions

ASF GitHub Bot (JIRA) Sun, 05 Aug 2018 19:16:53 -0700


    [ https://issues.apache.org/jira/browse/FLINK-10020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16569661#comment-16569661 ]


ASF GitHub Bot commented on FLINK-10020:
----------------------------------------

tweise commented on a change in pull request #6482: [FLINK-10020] [kinesis] Support recoverable exceptions in listShards.
URL: https://github.com/apache/flink/pull/6482#discussion_r207761087
 
 

 ##########
 File path: flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/proxy/KinesisProxy.java
 ##########
 @@ -433,6 +440,16 @@ private ListShardsResult listShards(String streamName, @Nullable String startSha
                         } catch (ExpiredNextTokenException expiredToken) {
                                 LOG.warn("List Shards has an expired token. Reusing the previous state.");
                                 break;
+                       } catch (SdkClientException ex) {
+                               if (isRecoverableSdkClientException(ex)) {
+                                       long backoffMillis = fullJitterBackoff(
+                                               listShardsBaseBackoffMillis, listShardsMaxBackoffMillis, listShardsExpConstant, attemptCount++);
+                                       LOG.warn("Got SdkClientException when listing shards from stream {}. Backing off for {} millis.",
+                                               streamName, backoffMillis);
+                                       Thread.sleep(backoffMillis);
 
 Review comment:
   Please see the JIRA for an example of such an exception. These are really the 
same type of exceptions that we don't want getRecords to fail on, and I believe 
we should be consistent with the backoff. Since listShards isn't latency 
sensitive, it won't hurt to err on the conservative side.
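
For readers unfamiliar with the backoff referred to above, here is a minimal sketch of a full-jitter exponential backoff. It is not copied from KinesisProxy; the class name, the parameter names, and the values in main (1000 ms base, 5000 ms cap, constant 1.5) are assumptions chosen for illustration. The idea is to cap an exponentially growing delay and then pick a uniformly random wait between zero and that cap.

```java
import java.util.Random;

// Hypothetical illustration class; not part of the Flink Kinesis connector.
class FullJitterBackoffSketch {
    private static final Random RANDOM = new Random();

    // Returns a random delay in [0, min(maxMillis, baseMillis * expConstant^attempt)).
    static long fullJitterBackoff(long baseMillis, long maxMillis, double expConstant, int attempt) {
        // Exponential growth, capped at the configured maximum.
        double cappedExponential =
                Math.min((double) maxMillis, baseMillis * Math.pow(expConstant, attempt));
        // Full jitter: scale the capped delay by a uniform random factor in [0, 1).
        return (long) (RANDOM.nextDouble() * cappedExponential);
    }

    public static void main(String[] args) {
        // Illustrative values only (assumed, not taken from the pull request).
        for (int attempt = 0; attempt < 5; attempt++) {
            long delay = fullJitterBackoff(1000L, 5000L, 1.5, attempt);
            System.out.println("attempt " + attempt + ": sleep " + delay + " ms");
        }
    }
}
```

Because the delay is randomized over the whole interval, concurrent consumers that fail at the same moment tend not to retry in lockstep, which is one reason a conservative backoff is cheap for a call that is not latency sensitive.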

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Kinesis Consumer listShards should support more recoverable exceptions
> ----------------------------------------------------------------------
>
>                 Key: FLINK-10020
>                 URL: https://issues.apache.org/jira/browse/FLINK-10020
>             Project: Flink
>          Issue Type: Improvement
>          Components: Kinesis Connector
>            Reporter: Thomas Weise
>            Assignee: Thomas Weise
>            Priority: Major
>              Labels: pull-request-available
>
> Currently transient errors in listShards make the consumer fail and cause the 
> entire job to reset. That is unnecessary for certain exceptions (like status 
> 503 errors). It should be possible to control the exceptions that qualify for 
> retry, similar to getRecords/isRecoverableSdkClientException.
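
The behaviour requested in the description above can be sketched as a retry loop whose set of recoverable exceptions is supplied by the caller. This is a hedged illustration only: RetrySketch, callWithRetry, and their parameters are hypothetical names and do not exist in Flink or the AWS SDK, and a fixed sleep is used here in place of any particular backoff strategy.

```java
import java.util.concurrent.Callable;
import java.util.function.Predicate;

// Hypothetical illustration class; not part of Flink or the AWS SDK.
class RetrySketch {
    // Invokes `call`, retrying only when `isRecoverable` accepts the exception,
    // for at most `maxAttempts` invocations in total.
    static <T> T callWithRetry(Callable<T> call,
                               Predicate<Exception> isRecoverable,
                               int maxAttempts,
                               long backoffMillis) throws Exception {
        int attempt = 0;
        while (true) {
            try {
                return call.call();
            } catch (InterruptedException interrupted) {
                // Never swallow interruption; propagate it to the caller.
                throw interrupted;
            } catch (Exception ex) {
                attempt++;
                // Give up on non-recoverable errors or when attempts are exhausted.
                if (!isRecoverable.test(ex) || attempt >= maxAttempts) {
                    throw ex;
                }
                Thread.sleep(backoffMillis);
            }
        }
    }
}
```

With this shape, a transient failure is retried while any other exception still fails fast, so only the predicate needs to change when more exception types should qualify for retry.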



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
