[ https://issues.apache.org/jira/browse/FLINK-20615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17256444#comment-17256444 ]
Till Rohrmann commented on FLINK-20615: --------------------------------------- Looking at the log output {code} [] - Source: Custom Source (3/4)#255 (90cbf40ffa5c2a89a8cc1a6c5ea9b688) switched from DEPLOYING to RUNNING. 2020-12-29T21:11:45.6842078Z Dec 29 21:11:04 2020-12-29 21:10:56,757 DEBUG org.apache.flink.streaming.runtime.tasks.StreamTask [] - Initializing Source: Custom Source (3/4)#255. 2020-12-29T21:11:45.6843136Z Dec 29 21:11:04 2020-12-29 21:10:56,758 DEBUG org.apache.flink.streaming.runtime.tasks.StreamTask [] - Invoking Source: Custom Source (3/4)#255 2020-12-29T21:11:45.6844432Z Dec 29 21:11:04 2020-12-29 21:10:56,758 DEBUG org.apache.flink.runtime.state.TaskLocalStateStoreImpl [] - Did not find registered local state for checkpoint 3 in subtask (e22dc2b9ff59ed3b5f9f444745153baa - bc764cd8ddf7a0cff126f51c16239658 - 2) 2020-12-29T21:11:45.6849247Z Dec 29 21:11:04 2020-12-29 21:10:56,758 DEBUG org.apache.flink.runtime.state.TaskStateManagerImpl [] - Operator bc764cd8ddf7a0cff126f51c16239658 has remote state SubtaskState{operatorStateFromBackend=StateObjectCollection{[OperatorStateHandle{stateNameToPartitionOffsets={currentKey=StateMetaInfo{offsets=[192], distributionMode=SPLIT_DISTRIBUTE}}, delegateStateHandle=ByteStreamStateHandle{handleName='file:/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-33991570506/local_recovery_test/checkpoints/e22dc2b9ff59ed3b5f9f444745153baa/chk-3/0e4a0721-a9a5-41d8-b6e4-94023ec5a1c8', dataBytes=200}}]}, operatorStateFromStream=StateObjectCollection{[]}, keyedStateFromBackend=StateObjectCollection{[]}, keyedStateFromStream=StateObjectCollection{[]}, inputChannelState=StateObjectCollection{[]}, resultSubpartitionState=StateObjectCollection{[]}, stateSize=200} from job manager and local state alternatives [] from local state store TaskLocalStateStore{jobID=e22dc2b9ff59ed3b5f9f444745153baa, jobVertexID=bc764cd8ddf7a0cff126f51c16239658, allocationID=fef8e1740736efdffe751fdc602fd0e1, subtaskIndex=2, localRecoveryConfig=LocalRecoveryConfig{localRecoveryMode=true, localStateDirectories=LocalRecoveryDirectoryProvider{rootDirectories=[/tmp/localState/aid_fef8e1740736efdffe751fdc602fd0e1], jobID=e22dc2b9ff59ed3b5f9f444745153baa, jobVertexID=bc764cd8ddf7a0cff126f51c16239658, subtaskIndex=2}}, storedCheckpointIDs=[]}. 2020-12-29T21:11:45.6852863Z Dec 29 21:11:04 2020-12-29 21:10:56,758 DEBUG org.apache.flink.streaming.api.operators.BackendRestorerProcedure [] - Creating operator state backend for StreamSource_bc764cd8ddf7a0cff126f51c16239658_(3/4) and restoring with state from alternative (1/1). 2020-12-29T21:11:45.6854442Z Dec 29 21:11:04 2020-12-29 21:10:56,804 DEBUG org.apache.flink.streaming.runtime.tasks.StreamTask [] - Finished task Source: Custom Source (3/4)#255 2020-12-29T21:11:45.6855660Z Dec 29 21:11:04 2020-12-29 21:10:56,805 DEBUG org.apache.flink.streaming.runtime.tasks.StreamTask [] - Closed operators for task Source: Custom Source (3/4)#255 2020-12-29T21:11:45.6857718Z Dec 29 21:11:04 2020-12-29 21:10:56,805 DEBUG org.apache.flink.runtime.io.network.partition.PipelinedSubpartition [] - Source: Custom Source (3/4)#255 (90cbf40ffa5c2a89a8cc1a6c5ea9b688): Finished PipelinedSubpartition#0 [number of buffers: 2 (0 bytes), number of buffers in backlog: 1, finished? true, read view? false]. 2020-12-29T21:11:45.6868241Z Dec 29 21:11:04 2020-12-29 21:10:56,805 DEBUG org.apache.flink.runtime.io.network.partition.PipelinedSubpartition [] - Source: Custom Source (3/4)#255 (90cbf40ffa5c2a89a8cc1a6c5ea9b688): Finished PipelinedSubpartition#1 [number of buffers: 2 (0 bytes), number of buffers in backlog: 1, finished? true, read view? false]. 2020-12-29T21:11:45.6870179Z Dec 29 21:11:04 2020-12-29 21:10:56,805 DEBUG org.apache.flink.runtime.io.network.partition.PipelinedSubpartition [] - Source: Custom Source (3/4)#255 (90cbf40ffa5c2a89a8cc1a6c5ea9b688): Finished PipelinedSubpartition#2 [number of buffers: 2 (0 bytes), number of buffers in backlog: 1, finished? true, read view? false]. 2020-12-29T21:11:45.6872073Z Dec 29 21:11:04 2020-12-29 21:10:56,805 DEBUG org.apache.flink.runtime.io.network.partition.PipelinedSubpartition [] - Source: Custom Source (3/4)#255 (90cbf40ffa5c2a89a8cc1a6c5ea9b688): Finished PipelinedSubpartition#3 [number of buffers: 2 (0 bytes), number of buffers in backlog: 1, finished? true, read view? false]. 2020-12-29T21:11:45.6873629Z Dec 29 21:11:04 2020-12-29 21:10:56,805 INFO org.apache.flink.runtime.taskmanager.Task [] - Source: Custom Source (3/4)#255 (90cbf40ffa5c2a89a8cc1a6c5ea9b688) switched from RUNNING to FINISHED. 2020-12-29T21:11:45.6874935Z Dec 29 21:11:04 2020-12-29 21:10:56,805 INFO org.apache.flink.runtime.taskmanager.Task [] - Freeing task resources for Source: Custom Source (3/4)#255 (90cbf40ffa5c2a89a8cc1a6c5ea9b688). 2020-12-29T21:11:45.6876227Z Dec 29 21:11:04 2020-12-29 21:10:56,805 DEBUG org.apache.flink.runtime.taskmanager.Task [] - Release task Source: Custom Source (3/4)#255 network resources (state: FINISHED). 2020-12-29T21:11:45.6877441Z Dec 29 21:11:04 2020-12-29 21:10:56,805 DEBUG org.apache.flink.runtime.io.network.TaskEventDispatcher [] - unregistering d39c5df41fb51c57ac0e340421e62374#2@90cbf40ffa5c2a89a8cc1a6c5ea9b688 2020-12-29T21:11:45.6878938Z Dec 29 21:11:04 2020-12-29 21:10:56,805 DEBUG org.apache.flink.runtime.taskmanager.Task [] - Ensuring all FileSystem streams are closed for task Source: Custom Source (3/4)#255 (90cbf40ffa5c2a89a8cc1a6c5ea9b688) [FINISHED] 2020-12-29T21:11:45.6880460Z Dec 29 21:11:04 2020-12-29 21:10:56,805 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Un-registering task and sending final execution state FINISHED to JobManager for task Source: Custom Source (3/4)#255 90cbf40ffa5c2a89a8cc1a6c5ea9b688. 2020-12-29T21:11:45.6882952Z Dec 29 21:11:04 2020-12-29 21:10:56,811 INFO org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation [] - Starting to restore from state handle: KeyGroupsStateHandle{groupRangeOffsets=KeyGroupRangeOffsets{keyGroupRange=KeyGroupRange{startKeyGroup=2, endKeyGroup=2}, offsets=[371]}, stateHandle=RelativeFileStateHandle State: file:/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-33991570506/local_recovery_test/checkpoints/e22dc2b9ff59ed3b5f9f444745153baa/chk-3/d7cfa567-a7f6-42d4-8027-826d0d872f65, d7cfa567-a7f6-42d4-8027-826d0d872f65 [2232041 bytes]}. 2020-12-29T21:11:45.6884825Z Dec 29 21:11:04 2020-12-29 21:10:59,077 DEBUG org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Received heartbeat request from b21775cb8c75328e6f3c66b79630f66f. 2020-12-29T21:11:45.6886529Z Dec 29 21:11:04 2020-12-29 21:11:00,175 DEBUG org.apache.flink.runtime.io.network.partition.ResultPartition [] - Source: Custom Source (3/4)#255 (90cbf40ffa5c2a89a8cc1a6c5ea9b688): Releasing PipelinedResultPartition d39c5df41fb51c57ac0e340421e62374#2@90cbf40ffa5c2a89a8cc1a6c5ea9b688 [PIPELINED_BOUNDED, 4 subpartitions, 4 pending consumptions]. 2020-12-29T21:11:45.6888372Z Dec 29 21:11:04 2020-12-29 21:11:00,176 DEBUG org.apache.flink.runtime.io.network.partition.PipelinedSubpartition [] - Source: Custom Source (3/4)#255 (90cbf40ffa5c2a89a8cc1a6c5ea9b688): Released PipelinedSubpartition#0 [number of buffers: 2 (0 bytes), number of buffers in backlog: 1, finished? true, read view? false]. 2020-12-29T21:11:45.6890284Z Dec 29 21:11:04 2020-12-29 21:11:00,176 DEBUG org.apache.flink.runtime.io.network.partition.PipelinedSubpartition [] - Source: Custom Source (3/4)#255 (90cbf40ffa5c2a89a8cc1a6c5ea9b688): Released PipelinedSubpartition#1 [number of buffers: 2 (0 bytes), number of buffers in backlog: 1, finished? true, read view? false]. 2020-12-29T21:11:45.6892309Z Dec 29 21:11:04 2020-12-29 21:11:00,176 DEBUG org.apache.flink.runtime.io.network.partition.PipelinedSubpartition [] - Source: Custom Source (3/4)#255 (90cbf40ffa5c2a89a8cc1a6c5ea9b688): Released PipelinedSubpartition#2 [number of buffers: 2 (0 bytes), number of buffers in backlog: 1, finished? true, read view? false]. 2020-12-29T21:11:45.6894366Z Dec 29 21:11:04 2020-12-29 21:11:00,176 DEBUG org.apache.flink.runtime.io.network.partition.PipelinedSubpartition [] - Source: Custom Source (3/4)#255 (90cbf40ffa5c2a89a8cc1a6c5ea9b688): Released PipelinedSubpartition#3 [number of buffers: 2 (0 bytes), number of buffers in backlog: 1, finished? true, read view? false]. 2020-12-29T21:11:45.6895936Z Dec 29 21:11:04 2020-12-29 21:11:00,176 DEBUG org.apache.flink.runtime.io.network.partition.ResultPartitionManager [] - Released partition d39c5df41fb51c57ac0e340421e62374#2 produced by 90cbf40ffa5c2a89a8cc1a6c5ea9b688. 2020-12-29T21:11:45.6897246Z Dec 29 21:11:04 2020-12-29 21:11:00,176 INFO org.apache.flink.runtime.taskmanager.Task [] - Attempting to cancel task Flat Map -> Sink: Unnamed (3/4)#255 (bc78e4bef29187b8464c79560bbe8306). 2020-12-29T21:11:45.6898580Z Dec 29 21:11:04 2020-12-29 21:11:00,176 INFO org.apache.flink.runtime.taskmanager.Task [] - Flat Map -> Sink: Unnamed (3/4)#255 (bc78e4bef29187b8464c79560bbe8306) switched from RUNNING to CANCELING. 2020-12-29T21:11:45.6899960Z Dec 29 21:11:04 2020-12-29 21:11:00,176 INFO org.apache.flink.runtime.taskmanager.Task [] - Triggering cancellation of task code Flat Map -> Sink: Unnamed (3/4)#255 (bc78e4bef29187b8464c79560bbe8306). 2020-12-29T21:11:45.6901668Z Dec 29 21:11:04 2020-12-29 21:11:00,199 DEBUG org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate [] - Flat Map -> Sink: Unnamed (3/4)#255 (bc78e4bef29187b8464c79560bbe8306): Releasing SingleInputGate{owningTaskName='Flat Map -> Sink: Unnamed (3/4)#255 (bc78e4bef29187b8464c79560bbe8306)', gateIndex=0}. 2020-12-29T21:11:45.6903009Z Dec 29 21:11:04 2020-12-29 21:11:00,202 ERROR org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder [] - Caught unexpected exception. 2020-12-29T21:11:45.6903541Z Dec 29 21:11:04 java.io.IOException: Stream Closed 2020-12-29T21:11:45.6903949Z Dec 29 21:11:04 at java.io.FileInputStream.readBytes(Native Method) ~[?:1.8.0_275] 2020-12-29T21:11:45.6904425Z Dec 29 21:11:04 at java.io.FileInputStream.read(FileInputStream.java:255) ~[?:1.8.0_275] 2020-12-29T21:11:45.6905393Z Dec 29 21:11:04 at org.apache.flink.core.fs.local.LocalDataInputStream.read(LocalDataInputStream.java:73) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6906632Z Dec 29 21:11:04 at org.apache.flink.core.fs.FSDataInputStreamWrapper.read(FSDataInputStreamWrapper.java:60) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6907835Z Dec 29 21:11:04 at org.apache.flink.runtime.util.ForwardingInputStream.read(ForwardingInputStream.java:52) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6908501Z Dec 29 21:11:04 at java.io.DataInputStream.readFully(DataInputStream.java:195) ~[?:1.8.0_275] 2020-12-29T21:11:45.6909188Z Dec 29 21:11:04 at java.io.DataInputStream.readFully(DataInputStream.java:169) ~[?:1.8.0_275] 2020-12-29T21:11:45.6910284Z Dec 29 21:11:04 at org.apache.flink.api.common.typeutils.base.array.BytePrimitiveArraySerializer.deserialize(BytePrimitiveArraySerializer.java:82) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6911706Z Dec 29 21:11:04 at org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restoreKVStateData(RocksDBFullRestoreOperation.java:229) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6913102Z Dec 29 21:11:04 at org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restoreKeyGroupsInStateHandle(RocksDBFullRestoreOperation.java:158) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6914761Z Dec 29 21:11:04 at org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restore(RocksDBFullRestoreOperation.java:142) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6916050Z Dec 29 21:11:04 at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder.build(RocksDBKeyedStateBackendBuilder.java:284) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6917307Z Dec 29 21:11:04 at org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(RocksDBStateBackend.java:587) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6918558Z Dec 29 21:11:04 at org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(RocksDBStateBackend.java:93) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6919989Z Dec 29 21:11:04 at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:328) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6921281Z Dec 29 21:11:04 at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:168) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6922514Z Dec 29 21:11:04 at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6923780Z Dec 29 21:11:04 at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:345) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6925068Z Dec 29 21:11:04 at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:163) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6926315Z Dec 29 21:11:04 at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:272) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6927495Z Dec 29 21:11:04 at org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:425) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6928612Z Dec 29 21:11:04 at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$2(StreamTask.java:535) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6929764Z Dec 29 21:11:04 at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6930849Z Dec 29 21:11:04 at org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:525) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6931986Z Dec 29 21:11:04 at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:565) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6932962Z Dec 29 21:11:04 at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:755) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6933872Z Dec 29 21:11:04 at org.apache.flink.runtime.taskmanager.Task.run(Task.java:570) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6934651Z Dec 29 21:11:04 at java.lang.Thread.run(Thread.java:748) [?:1.8.0_275] 2020-12-29T21:11:45.6935871Z Dec 29 21:11:04 2020-12-29 21:11:00,202 WARN org.apache.flink.streaming.api.operators.BackendRestorerProcedure [] - Exception while restoring keyed state backend for StreamFlatMap_20ba6b65f97481d5570070de90e4e791_(3/4) from alternative (1/1), will retry while more alternatives are available. 2020-12-29T21:11:45.6936711Z Dec 29 21:11:04 org.apache.flink.runtime.state.BackendBuildingException: Caught unexpected exception. 2020-12-29T21:11:45.6937763Z Dec 29 21:11:04 at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder.build(RocksDBKeyedStateBackendBuilder.java:361) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6938970Z Dec 29 21:11:04 at org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(RocksDBStateBackend.java:587) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6940179Z Dec 29 21:11:04 at org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(RocksDBStateBackend.java:93) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6941454Z Dec 29 21:11:04 at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:328) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6942720Z Dec 29 21:11:04 at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:168) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6943954Z Dec 29 21:11:04 at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6945206Z Dec 29 21:11:04 at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:345) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6946495Z Dec 29 21:11:04 at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:163) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6947738Z Dec 29 21:11:04 at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:272) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6948909Z Dec 29 21:11:04 at org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:425) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6950041Z Dec 29 21:11:04 at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$2(StreamTask.java:535) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6951236Z Dec 29 21:11:04 at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6952341Z Dec 29 21:11:04 at org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:525) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6953377Z Dec 29 21:11:04 at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:565) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6954661Z Dec 29 21:11:04 at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:755) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6955680Z Dec 29 21:11:04 at org.apache.flink.runtime.taskmanager.Task.run(Task.java:570) [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6956264Z Dec 29 21:11:04 at java.lang.Thread.run(Thread.java:748) [?:1.8.0_275] 2020-12-29T21:11:45.6956676Z Dec 29 21:11:04 Caused by: java.io.IOException: Stream Closed 2020-12-29T21:11:45.6957325Z Dec 29 21:11:04 at java.io.FileInputStream.readBytes(Native Method) ~[?:1.8.0_275] 2020-12-29T21:11:45.6957778Z Dec 29 21:11:04 at java.io.FileInputStream.read(FileInputStream.java:255) ~[?:1.8.0_275] 2020-12-29T21:11:45.6958728Z Dec 29 21:11:04 at org.apache.flink.core.fs.local.LocalDataInputStream.read(LocalDataInputStream.java:73) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6959787Z Dec 29 21:11:04 at org.apache.flink.core.fs.FSDataInputStreamWrapper.read(FSDataInputStreamWrapper.java:60) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6960919Z Dec 29 21:11:04 at org.apache.flink.runtime.util.ForwardingInputStream.read(ForwardingInputStream.java:52) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6961563Z Dec 29 21:11:04 at java.io.DataInputStream.readFully(DataInputStream.java:195) ~[?:1.8.0_275] 2020-12-29T21:11:45.6962092Z Dec 29 21:11:04 at java.io.DataInputStream.readFully(DataInputStream.java:169) ~[?:1.8.0_275] 2020-12-29T21:11:45.6963151Z Dec 29 21:11:04 at org.apache.flink.api.common.typeutils.base.array.BytePrimitiveArraySerializer.deserialize(BytePrimitiveArraySerializer.java:82) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6964419Z Dec 29 21:11:04 at org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restoreKVStateData(RocksDBFullRestoreOperation.java:229) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6965751Z Dec 29 21:11:04 at org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restoreKeyGroupsInStateHandle(RocksDBFullRestoreOperation.java:158) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6967017Z Dec 29 21:11:04 at org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restore(RocksDBFullRestoreOperation.java:142) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6968252Z Dec 29 21:11:04 at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder.build(RocksDBKeyedStateBackendBuilder.java:284) ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] 2020-12-29T21:11:45.6968852Z Dec 29 21:11:04 ... 16 more {code} It looks as if one of the source operators finishes before the consumer has been started. [~chesnay] do you have an idea how this can happen? > Local recovery and sticky scheduling end-to-end test timeout with > "IOException: Stream Closed" > ---------------------------------------------------------------------------------------------- > > Key: FLINK-20615 > URL: https://issues.apache.org/jira/browse/FLINK-20615 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends > Affects Versions: 1.12.0, 1.13.0 > Reporter: Huang Xingbo > Priority: Critical > Labels: test-stability > Fix For: 1.13.0, 1.12.2 > > > [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=10905&view=logs&j=6caf31d6-847a-526e-9624-468e053467d6&t=0b23652f-b18b-5b6e-6eb6-a11070364610] > It tried to restart many times, and the final error was following: > {code:java} > 2020-12-15T23:54:00.5067862Z Dec 15 23:53:42 2020-12-15 23:53:41,538 ERROR > org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder [] - > Caught unexpected exception. > 2020-12-15T23:54:00.5068392Z Dec 15 23:53:42 java.io.IOException: Stream > Closed > 2020-12-15T23:54:00.5068767Z Dec 15 23:53:42 at > java.io.FileInputStream.readBytes(Native Method) ~[?:?] > 2020-12-15T23:54:00.5069223Z Dec 15 23:53:42 at > java.io.FileInputStream.read(FileInputStream.java:279) ~[?:?] > 2020-12-15T23:54:00.5070150Z Dec 15 23:53:42 at > org.apache.flink.core.fs.local.LocalDataInputStream.read(LocalDataInputStream.java:73) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5071217Z Dec 15 23:53:42 at > org.apache.flink.core.fs.FSDataInputStreamWrapper.read(FSDataInputStreamWrapper.java:61) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5072295Z Dec 15 23:53:42 at > org.apache.flink.runtime.util.ForwardingInputStream.read(ForwardingInputStream.java:51) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5072967Z Dec 15 23:53:42 at > java.io.DataInputStream.readFully(DataInputStream.java:200) ~[?:?] > 2020-12-15T23:54:00.5073483Z Dec 15 23:53:42 at > java.io.DataInputStream.readFully(DataInputStream.java:170) ~[?:?] > 2020-12-15T23:54:00.5074535Z Dec 15 23:53:42 at > org.apache.flink.api.common.typeutils.base.array.BytePrimitiveArraySerializer.deserialize(BytePrimitiveArraySerializer.java:85) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5075847Z Dec 15 23:53:42 at > org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restoreKVStateData(RocksDBFullRestoreOperation.java:222) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5077187Z Dec 15 23:53:42 at > org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restoreKeyGroupsInStateHandle(RocksDBFullRestoreOperation.java:169) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5078495Z Dec 15 23:53:42 at > org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restore(RocksDBFullRestoreOperation.java:152) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5079802Z Dec 15 23:53:42 at > org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder.build(RocksDBKeyedStateBackendBuilder.java:269) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5081013Z Dec 15 23:53:42 at > org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(RocksDBStateBackend.java:565) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5082215Z Dec 15 23:53:42 at > org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(RocksDBStateBackend.java:94) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5083500Z Dec 15 23:53:42 at > org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:299) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5084899Z Dec 15 23:53:42 at > org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:142) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5086342Z Dec 15 23:53:42 at > org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:121) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5087601Z Dec 15 23:53:42 at > org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:316) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5088924Z Dec 15 23:53:42 at > org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:155) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5090261Z Dec 15 23:53:42 at > org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:248) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5091459Z Dec 15 23:53:42 at > org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:400) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5092604Z Dec 15 23:53:42 at > org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$2(StreamTask.java:507) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5093748Z Dec 15 23:53:42 at > org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:47) > [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5094866Z Dec 15 23:53:42 at > org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:501) > [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5095912Z Dec 15 23:53:42 at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:531) > [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5096875Z Dec 15 23:53:42 at > org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722) > [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5097814Z Dec 15 23:53:42 at > org.apache.flink.runtime.taskmanager.Task.run(Task.java:547) > [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5098373Z Dec 15 23:53:42 at > java.lang.Thread.run(Thread.java:834) [?:?] > 2020-12-15T23:54:00.5099549Z Dec 15 23:53:42 2020-12-15 23:53:41,557 WARN > org.apache.flink.streaming.api.operators.BackendRestorerProcedure [] - > Exception while restoring keyed state backend for > StreamFlatMap_20ba6b65f97481d5570070de90e4e791_(1/4) from alternative (1/1), > will retry while more alternatives are available. > 2020-12-15T23:54:00.5100556Z Dec 15 23:53:42 > org.apache.flink.runtime.state.BackendBuildingException: Caught unexpected > exception. > 2020-12-15T23:54:00.5101480Z Dec 15 23:53:42 at > org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder.build(RocksDBKeyedStateBackendBuilder.java:328) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5102669Z Dec 15 23:53:42 at > org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(RocksDBStateBackend.java:565) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5103763Z Dec 15 23:53:42 at > org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(RocksDBStateBackend.java:94) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5104723Z Dec 15 23:53:42 at > org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:299) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5105700Z Dec 15 23:53:42 at > org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:142) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5106630Z Dec 15 23:53:42 at > org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:121) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5107587Z Dec 15 23:53:42 at > org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:316) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5108581Z Dec 15 23:53:42 at > org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:155) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5109505Z Dec 15 23:53:42 at > org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:248) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5110456Z Dec 15 23:53:42 at > org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:400) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5111316Z Dec 15 23:53:42 at > org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$2(StreamTask.java:507) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5112175Z Dec 15 23:53:42 at > org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:47) > [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5113012Z Dec 15 23:53:42 at > org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:501) > [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5113787Z Dec 15 23:53:42 at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:531) > [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5114521Z Dec 15 23:53:42 at > org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722) > [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5115209Z Dec 15 23:53:42 at > org.apache.flink.runtime.taskmanager.Task.run(Task.java:547) > [flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5115635Z Dec 15 23:53:42 at > java.lang.Thread.run(Thread.java:834) [?:?] > 2020-12-15T23:54:00.5115949Z Dec 15 23:53:42 Caused by: java.io.IOException: > Stream Closed > 2020-12-15T23:54:00.5116246Z Dec 15 23:53:42 at > java.io.FileInputStream.readBytes(Native Method) ~[?:?] > 2020-12-15T23:54:00.5116589Z Dec 15 23:53:42 at > java.io.FileInputStream.read(FileInputStream.java:279) ~[?:?] > 2020-12-15T23:54:00.5117284Z Dec 15 23:53:42 at > org.apache.flink.core.fs.local.LocalDataInputStream.read(LocalDataInputStream.java:73) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5118080Z Dec 15 23:53:42 at > org.apache.flink.core.fs.FSDataInputStreamWrapper.read(FSDataInputStreamWrapper.java:61) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5118894Z Dec 15 23:53:42 at > org.apache.flink.runtime.util.ForwardingInputStream.read(ForwardingInputStream.java:51) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5119392Z Dec 15 23:53:42 at > java.io.DataInputStream.readFully(DataInputStream.java:200) ~[?:?] > 2020-12-15T23:54:00.5119808Z Dec 15 23:53:42 at > java.io.DataInputStream.readFully(DataInputStream.java:170) ~[?:?] > 2020-12-15T23:54:00.5120605Z Dec 15 23:53:42 at > org.apache.flink.api.common.typeutils.base.array.BytePrimitiveArraySerializer.deserialize(BytePrimitiveArraySerializer.java:85) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5121576Z Dec 15 23:53:42 at > org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restoreKVStateData(RocksDBFullRestoreOperation.java:222) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5122579Z Dec 15 23:53:42 at > org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restoreKeyGroupsInStateHandle(RocksDBFullRestoreOperation.java:169) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5123543Z Dec 15 23:53:42 at > org.apache.flink.contrib.streaming.state.restore.RocksDBFullRestoreOperation.restore(RocksDBFullRestoreOperation.java:152) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5124476Z Dec 15 23:53:42 at > org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder.build(RocksDBKeyedStateBackendBuilder.java:269) > ~[flink-dist_2.11-1.13-SNAPSHOT.jar:1.13-SNAPSHOT] > 2020-12-15T23:54:00.5124994Z Dec 15 23:53:42 ... 16 more > {code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)