[ https://issues.apache.org/jira/browse/FLINK-28440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17814205#comment-17814205 ]

Matthias Pohl commented on FLINK-28440:
---------------------------------------

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=57274&view=logs&j=5c8e7682-d68f-54d1-16a2-a09310218a49&t=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba&l=8885

{code}
Feb 05 03:47:04 03:47:04.282 [ERROR] Tests run: 9, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 29.92 s <<< FAILURE! -- in org.apache.flink.test.checkpointing.ChangelogRecoveryITCase
Feb 05 03:47:04 03:47:04.282 [ERROR] org.apache.flink.test.checkpointing.ChangelogRecoveryITCase.testMaterialization[delegated state backend type = EmbeddedRocksDBStateBackend{, localRocksDbDirectories=null, enableIncrementalCheckpointing=TRUE, numberOfTransferThreads=-1, writeBatchSize=-1}] -- Time elapsed: 5.361 s <<< ERROR!
Feb 05 03:47:04 org.apache.flink.runtime.JobException: Recovery is suppressed by FixedDelayRestartBackoffTimeStrategy(maxNumberRestartAttempts=2, backoffTimeMS=0)
Feb 05 03:47:04     at org.apache.flink.runtime.executiongraph.failover.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:180)
Feb 05 03:47:04     at org.apache.flink.runtime.executiongraph.failover.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:107)
Feb 05 03:47:04     at org.apache.flink.runtime.scheduler.DefaultScheduler.recordTaskFailure(DefaultScheduler.java:277)
Feb 05 03:47:04     at org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:268)
Feb 05 03:47:04     at org.apache.flink.runtime.scheduler.DefaultScheduler.onTaskFailed(DefaultScheduler.java:261)
Feb 05 03:47:04     at org.apache.flink.runtime.scheduler.SchedulerBase.onTaskExecutionStateUpdate(SchedulerBase.java:787)
Feb 05 03:47:04     at org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:764)
Feb 05 03:47:04     at org.apache.flink.runtime.scheduler.SchedulerNG.updateTaskExecutionState(SchedulerNG.java:83)
Feb 05 03:47:04     at org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:488)
Feb 05 03:47:04     at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
[...]
Feb 05 03:47:04     at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
Feb 05 03:47:04 Caused by: java.lang.Exception: Exception while creating StreamOperatorStateContext.
Feb 05 03:47:04     at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:294)
Feb 05 03:47:04     at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:266)
Feb 05 03:47:04     at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:106)
Feb 05 03:47:04     at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreStateAndGates(StreamTask.java:799)
Feb 05 03:47:04     at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$restoreInternal$3(StreamTask.java:753)
Feb 05 03:47:04     at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)
Feb 05 03:47:04     at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:753)
Feb 05 03:47:04     at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:712)
Feb 05 03:47:04     at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958)
Feb 05 03:47:04     at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927)
Feb 05 03:47:04     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:751)
Feb 05 03:47:04     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566)
Feb 05 03:47:04     at java.lang.Thread.run(Thread.java:748)
Feb 05 03:47:04 Caused by: org.apache.flink.util.FlinkException: Could not restore keyed state backend for WindowOperator_08a489791a4e7fcd83ae029ef13928c6_(4/4) from any of the 1 provided restore options.
Feb 05 03:47:04     at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:165)
Feb 05 03:47:04     at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:399)
Feb 05 03:47:04     at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:180)
Feb 05 03:47:04     ... 12 more
Feb 05 03:47:04 Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: /tmp/junit71073126596521536/junit200918151019600319/3d58091e2f5857dd929374a673752361/dstl/ed44af84-4e46-4527-a2f6-e102fc710043 (No such file or directory)
Feb 05 03:47:04     at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:321)
Feb 05 03:47:04     at org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.advance(StateChangelogHandleStreamHandleReader.java:87)
Feb 05 03:47:04     at org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.hasNext(StateChangelogHandleStreamHandleReader.java:69)
Feb 05 03:47:04     at org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.readBackendHandle(ChangelogBackendRestoreOperation.java:107)
Feb 05 03:47:04     at org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.restore(ChangelogBackendRestoreOperation.java:78)
Feb 05 03:47:04     at org.apache.flink.state.changelog.ChangelogStateBackend.restore(ChangelogStateBackend.java:94)
Feb 05 03:47:04     at org.apache.flink.state.changelog.AbstractChangelogStateBackend.createKeyedStateBackend(AbstractChangelogStateBackend.java:81)
Feb 05 03:47:04     at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$3(StreamTaskStateInitializerImpl.java:393)
Feb 05 03:47:04     at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:173)
Feb 05 03:47:04     at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:137)
Feb 05 03:47:04     ... 14 more
Feb 05 03:47:04 Caused by: java.io.FileNotFoundException: /tmp/junit71073126596521536/junit200918151019600319/3d58091e2f5857dd929374a673752361/dstl/ed44af84-4e46-4527-a2f6-e102fc710043 (No such file or directory)
Feb 05 03:47:04     at java.io.FileInputStream.open0(Native Method)
Feb 05 03:47:04     at java.io.FileInputStream.open(FileInputStream.java:195)
Feb 05 03:47:04     at java.io.FileInputStream.<init>(FileInputStream.java:138)
Feb 05 03:47:04     at org.apache.flink.core.fs.local.LocalDataInputStream.<init>(LocalDataInputStream.java:50)
Feb 05 03:47:04     at org.apache.flink.core.fs.local.LocalFileSystem.open(LocalFileSystem.java:134)
Feb 05 03:47:04     at org.apache.flink.core.fs.SafetyNetWrapperFileSystem.open(SafetyNetWrapperFileSystem.java:87)
Feb 05 03:47:04     at org.apache.flink.runtime.state.filesystem.FileStateHandle.openInputStream(FileStateHandle.java:72)
Feb 05 03:47:04     at org.apache.flink.changelog.fs.ChangelogStreamHandleReaderWithCache.openAndSeek(ChangelogStreamHandleReaderWithCache.java:89)
Feb 05 03:47:04     at org.apache.flink.changelog.fs.StateChangeIteratorImpl.read(StateChangeIteratorImpl.java:42)
Feb 05 03:47:04     at org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.advance(StateChangelogHandleStreamHandleReader.java:85)
Feb 05 03:47:04     ... 22 more
{code}

> EventTimeWindowCheckpointingITCase failed with restore
> ------------------------------------------------------
>
> Key: FLINK-28440
> URL: https://issues.apache.org/jira/browse/FLINK-28440
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Checkpointing, Runtime / State Backends
> Affects Versions: 1.16.0, 1.17.0, 1.18.0, 1.19.0
> Reporter: Huang Xingbo
> Assignee: Yanfei Lei
> Priority: Critical
> Labels: auto-deprioritized-critical, pull-request-available, stale-assigned, test-stability
> Fix For: 1.19.0
>
> Attachments: image-2023-02-01-00-51-54-506.png,
>     image-2023-02-01-01-10-01-521.png, image-2023-02-01-01-19-12-182.png,
>     image-2023-02-01-16-47-23-756.png, image-2023-02-01-16-57-43-889.png,
>     image-2023-02-02-10-52-56-599.png, image-2023-02-03-10-09-07-586.png,
>     image-2023-02-03-12-03-16-155.png, image-2023-02-03-12-03-56-614.png
>
>
> {code:java}
> Caused by: java.lang.Exception: Exception while creating StreamOperatorStateContext.
>     at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:256)
>     at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:268)
>     at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:106)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:722)
>     at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:698)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:665)
>     at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935)
>     at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:904)
>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:550)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.flink.util.FlinkException: Could not restore keyed state backend for WindowOperator_0a448493b4782967b150582570326227_(2/4) from any of the 1 provided restore options.
>     at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:160)
>     at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:353)
>     at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:165)
>     ... 11 more
> Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: /tmp/junit1835099326935900400/junit1113650082510421526/52ee65b7-033f-4429-8ddd-adbe85e27ced (No such file or directory)
>     at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:321)
>     at org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.advance(StateChangelogHandleStreamHandleReader.java:87)
>     at org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.hasNext(StateChangelogHandleStreamHandleReader.java:69)
>     at org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.readBackendHandle(ChangelogBackendRestoreOperation.java:96)
>     at org.apache.flink.state.changelog.restore.ChangelogBackendRestoreOperation.restore(ChangelogBackendRestoreOperation.java:75)
>     at org.apache.flink.state.changelog.ChangelogStateBackend.restore(ChangelogStateBackend.java:92)
>     at org.apache.flink.state.changelog.AbstractChangelogStateBackend.createKeyedStateBackend(AbstractChangelogStateBackend.java:136)
>     at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:336)
>     at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:168)
>     at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135)
>     ... 13 more
> Caused by: java.io.FileNotFoundException: /tmp/junit1835099326935900400/junit1113650082510421526/52ee65b7-033f-4429-8ddd-adbe85e27ced (No such file or directory)
>     at java.io.FileInputStream.open0(Native Method)
>     at java.io.FileInputStream.open(FileInputStream.java:195)
>     at java.io.FileInputStream.<init>(FileInputStream.java:138)
>     at org.apache.flink.core.fs.local.LocalDataInputStream.<init>(LocalDataInputStream.java:50)
>     at org.apache.flink.core.fs.local.LocalFileSystem.open(LocalFileSystem.java:134)
>     at org.apache.flink.core.fs.SafetyNetWrapperFileSystem.open(SafetyNetWrapperFileSystem.java:87)
>     at org.apache.flink.runtime.state.filesystem.FileStateHandle.openInputStream(FileStateHandle.java:72)
>     at org.apache.flink.changelog.fs.StateChangeFormat.read(StateChangeFormat.java:92)
>     at org.apache.flink.runtime.state.changelog.StateChangelogHandleStreamHandleReader$1.advance(StateChangelogHandleStreamHandleReader.java:85)
>     ... 21 more
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=37772&view=logs&j=4d4a0d10-fca2-5507-8eed-c07f0bdf4887&t=7b25afdf-cc6c-566f-5459-359dc2585798&l=8916
> Other tests in which this stack trace was observed are {{ChangelogRecoveryITCase}} (FLINK-30107) and {{ChangelogRecoverySwitchStateBackendITCase}} (FLINK-28898).

--
This message was sent by Atlassian Jira
(v8.20.10#820010)