[ https://issues.apache.org/jira/browse/FLINK-26196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hangxiang Yu closed FLINK-26196.
--------------------------------
    Resolution: Cannot Reproduce

Hi, [~hjw]
I am closing this because it lacks specific information and there has been no response for more than a year.
Please reopen it if you can provide more details, e.g. JM/TM logs or code that reproduces the problem.

> error when Incremental Checkpoints by RocksDb
> -----------------------------------------------
>
>                 Key: FLINK-26196
>                 URL: https://issues.apache.org/jira/browse/FLINK-26196
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / State Backends
>    Affects Versions: 1.13.2
>            Reporter: hjw
>            Priority: Critical
>         Attachments: image-2022-02-22-12-02-46-835.png
>
>
> When I use incremental checkpoints with RocksDB, errors occur occasionally.
> Fortunately, the Flink job keeps running normally.
> Log:
> {code:java}
> java.io.IOException: Could not perform checkpoint 2804 for operator cc-rule-keyByAndReduceStream (2/8)#1.
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:1045)
>     at org.apache.flink.streaming.runtime.io.checkpointing.CheckpointBarrierHandler.notifyCheckpoint(CheckpointBarrierHandler.java:135)
>     at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.triggerCheckpoint(SingleCheckpointBarrierHandler.java:250)
>     at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.access$100(SingleCheckpointBarrierHandler.java:61)
>     at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler$ControllerImpl.triggerGlobalCheckpoint(SingleCheckpointBarrierHandler.java:431)
>     at org.apache.flink.streaming.runtime.io.checkpointing.AbstractAlignedBarrierHandlerState.barrierReceived(AbstractAlignedBarrierHandlerState.java:61)
>     at org.apache.flink.streaming.runtime.io.checkpointing.SingleCheckpointBarrierHandler.processBarrier(SingleCheckpointBarrierHandler.java:227)
>     at org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.handleEvent(CheckpointedInputGate.java:180)
>     at org.apache.flink.streaming.runtime.io.checkpointing.CheckpointedInputGate.pollNext(CheckpointedInputGate.java:158)
>     at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:110)
>     at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:66)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:423)
>     at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:204)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:681)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.executeInvoke(StreamTask.java:636)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:647)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:620)
>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:779)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.flink.runtime.checkpoint.CheckpointException: Could not complete snapshot 2804 for operator cc-rule-keyByAndReduceStream (2/8)#1. Failure reason: Checkpoint was declined.
>     at org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.snapshotState(StreamOperatorStateHandler.java:264)
>     at org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.snapshotState(StreamOperatorStateHandler.java:169)
>     at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:371)
>     at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.checkpointStreamOperator(SubtaskCheckpointCoordinatorImpl.java:706)
>     at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.buildOperatorSnapshotFutures(SubtaskCheckpointCoordinatorImpl.java:627)
>     at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.takeSnapshotSync(SubtaskCheckpointCoordinatorImpl.java:590)
>     at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.checkpointState(SubtaskCheckpointCoordinatorImpl.java:312)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$performCheckpoint$8(StreamTask.java:1089)
>     at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:1073)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:1029)
>     ... 19 more
> Caused by: org.rocksdb.RocksDBException: while link file to /opt/flink/log/iodir/flink-io-1c4c28bd-c5ce-4c07-9d33-81d480ec5216/job_b9a574334a212349298b39d98567b519_op_WindowOperator_306d8342cb5b2ad8b53f1be57f65bee8__2_8__uuid_c9adab75-3696-4342-9019-e8477cf0a7ca/chk-2804.tmp/000279.sst: /opt/flink/log/iodir/flink-io-1c4c28bd-c5ce-4c07-9d33-81d480ec5216/job_b9a574334a212349298b39d98567b519_op_WindowOperator_306d8342cb5b2ad8b53f1be57f65bee8__2_8__uuid_c9adab75-3696-4342-9019-e8477cf0a7ca/db/000279.sst: File exists
>     at org.rocksdb.Checkpoint.createCheckpoint(Native Method)
>     at org.rocksdb.Checkpoint.createCheckpoint(Checkpoint.java:51)
>     at org.apache.flink.contrib.streaming.state.snapshot.RocksIncrementalSnapshotStrategy.takeDBNativeCheckpoint(RocksIncrementalSnapshotStrategy.java:288)
>     at org.apache.flink.contrib.streaming.state.snapshot.RocksIncrementalSnapshotStrategy.syncPrepareResources(RocksIncrementalSnapshotStrategy.java:157)
>     at org.apache.flink.contrib.streaming.state.snapshot.RocksIncrementalSnapshotStrategy.syncPrepareResources(RocksIncrementalSnapshotStrategy.java:83)
>     at org.apache.flink.runtime.state.SnapshotStrategyRunner.snapshot(SnapshotStrategyRunner.java:77)
>     at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.snapshot(RocksDBKeyedStateBackend.java:551)
>     at org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.snapshotState(StreamOperatorStateHandler.java:241)
>     ... 29 more
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)