[ https://issues.apache.org/jira/browse/FLINK-5107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15683385#comment-15683385 ]
ASF GitHub Bot commented on FLINK-5107:
---------------------------------------

Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2837#discussion_r88879086

    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionVertex.java ---
    @@ -125,7 +142,7 @@ public ExecutionVertex(

    -		this.priorExecutions = new CopyOnWriteArrayList<Execution>();
    +		this.priorExecutions = new EvictingBoundedList<>(maxPriorExecutionHistoryLength);
    --- End diff --

    we should either have a proper default value here or modify the SubtaskExecutionAttempt*Handler classes to be able to deal with cases where `ExecutionVertex#getPriorExecutionAttempt()` returns null.

> Job Manager goes out of memory from long history of prior execution attempts
> -----------------------------------------------------------------------------
>
>                 Key: FLINK-5107
>                 URL: https://issues.apache.org/jira/browse/FLINK-5107
>             Project: Flink
>          Issue Type: Bug
>          Components: JobManager
>            Reporter: Stefan Richter
>            Assignee: Stefan Richter
>
> We have observed that the job manager can run out of memory during long-running jobs with many vertices. Analysis of the heap dump shows that the ever-growing history of prior execution attempts is the culprit.
> We should limit this history to the n most recent attempts.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
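To illustrate the idea behind the proposed fix, here is a minimal sketch of an evicting bounded list: it keeps only the last `maxSize` elements while preserving logical indices, so the history cannot grow without bound. This is a hypothetical illustration, not Flink's actual `EvictingBoundedList` implementation; the class name `BoundedHistory` and its API are assumptions. Note that `get()` returns null for evicted entries, which is exactly the case the review comment says callers like the `SubtaskExecutionAttempt*Handler` classes would need to handle.

```java
// Hypothetical sketch of a bounded, evicting history list.
// Old entries are overwritten in a fixed-size ring buffer instead of
// being retained forever (the cause of the JobManager OOM above).
public class BoundedHistory<T> {
    private final Object[] ring;
    private int size; // total number of elements ever added

    public BoundedHistory(int maxSize) {
        if (maxSize <= 0) {
            throw new IllegalArgumentException("maxSize must be > 0");
        }
        this.ring = new Object[maxSize];
    }

    public void add(T value) {
        ring[size % ring.length] = value;
        size++;
    }

    @SuppressWarnings("unchecked")
    public T get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        if (index < size - ring.length) {
            // Element was evicted; callers must handle null
            // (cf. the review comment about getPriorExecutionAttempt()).
            return null;
        }
        return (T) ring[index % ring.length];
    }

    public int size() {
        return size;
    }
}
```

With a bound of 2, adding three elements evicts the first: `get(0)` returns null while `get(1)` and `get(2)` still return the two most recent entries, and `size()` still reports the total attempt count.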