[ https://issues.apache.org/jira/browse/FLINK-9598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16571678#comment-16571678 ]
ASF GitHub Bot commented on FLINK-9598:
---------------------------------------

zentol commented on issue #6346: [FLINK-9598] Refine java-doc about the min pause between checkpoints
URL: https://github.com/apache/flink/pull/6346#issuecomment-411062550

After looking at the discussion threads I'm not sure if it makes sense to merge this PR. If the behavior is deemed buggy we shouldn't touch the javadocs and should fix the behavior instead. One could argue that they should still outline the _current_ state, but then we end up switching back-and-forth between versions.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> [Checkpoints] The config Minimum Pause Between Checkpoints doesn't work when there's a checkpoint failure
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-9598
>                 URL: https://issues.apache.org/jira/browse/FLINK-9598
>             Project: Flink
>          Issue Type: Bug
>   Affects Versions: 1.3.2
>            Reporter: Prem Santosh
>            Assignee: Yun Tang
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: Screen Shot 2018-06-20 at 7.44.10 AM.png
>
>
> We have set the config Minimum Pause Between Checkpoints to be 10 min but noticed that when a checkpoint fails (because it times out before it completes) the application immediately starts taking the next checkpoint. This basically stalls the application's progress since it's always taking checkpoints.
> [^Screen Shot 2018-06-20 at 7.44.10 AM.png] is a screenshot of this issue.
> Details:
> * Running Flink-1.3.2 on EMR
> * checkpoint timeout duration: 40 min
> * minimum pause between checkpoints: 10 min
> There is also a [relevant thread|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Having-a-backoff-while-experiencing-checkpointing-failures-td20618.html] that I found on the Flink users group.
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
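To make the numbers in the report concrete, here is a minimal toy model of the two behaviors being contrasted: the next checkpoint starting immediately after a timed-out one (as reported), versus the configured minimum pause also being applied after a failure (what the reporter expected). This is only a sketch and not Flink's checkpoint coordinator code. The function name, its parameters, and the assumption that the failed checkpoint ends exactly at its timeout are hypothetical and exist just for illustration.

```python
# Toy model only: this is NOT Flink's implementation.
# All names below are hypothetical and exist just for this illustration.

MINUTE_MS = 60 * 1000


def earliest_next_trigger(prev_end_ms, min_pause_ms, prev_failed, pause_after_failure):
    """Return the earliest time (in ms) at which the next checkpoint may start."""
    if prev_failed and not pause_after_failure:
        # Behavior described in the report: no pause after a failed checkpoint.
        return prev_end_ms
    # Otherwise the minimum pause is counted from the end of the previous checkpoint.
    return prev_end_ms + min_pause_ms


# Values taken from the report.
timeout_ms = 40 * MINUTE_MS
min_pause_ms = 10 * MINUTE_MS

# Assumption: a checkpoint triggered at t = 0 times out and so ends at t = 40 min.
prev_end_ms = 0 + timeout_ms

reported = earliest_next_trigger(prev_end_ms, min_pause_ms,
                                 prev_failed=True, pause_after_failure=False)
expected = earliest_next_trigger(prev_end_ms, min_pause_ms,
                                 prev_failed=True, pause_after_failure=True)

print("reported: next checkpoint may start at minute", reported // MINUTE_MS)
print("expected: next checkpoint may start at minute", expected // MINUTE_MS)
```

Under these assumptions the reported behavior leaves no gap between consecutive checkpoints, which matches the description of the application "always taking checkpoints", while the expected behavior leaves a 10 minute window for the job to make progress.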