[ 
https://issues.apache.org/jira/browse/IGNITE-14684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maria Makedonskaya updated IGNITE-14684:
----------------------------------------
    Description: 
Checkpoint listener 
org.apache.ignite.internal.processors.localtask.DurableBackgroundTasksProcessor#afterCheckpointEnd,
which is triggered at the end of the checkpoint process, cannot take the checkpoint read 
lock while the node is stopping.

To reproduce, run the test (see the exception in the log):
org.apache.ignite.internal.processors.cache.persistence.db.LongDestroyDurableBackgroundTaskTest#testDestroyTaskLifecycle
{noformat}
[2021-05-05 15:41:10,907][ERROR][db-checkpoint-thread-#87%db.LongDestroyDurableBackgroundTaskTest0%][root] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeFailureHandler [super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=class o.a.i.IgniteException: Failed to perform cache update: node is stopping.]]
class org.apache.ignite.IgniteException: Failed to perform cache update: node is stopping.
    at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointTimeoutLock.checkpointReadLock(CheckpointTimeoutLock.java:127)
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.checkpointReadLock(GridCacheDatabaseSharedManager.java:1583)
    at org.apache.ignite.internal.processors.localtask.DurableBackgroundTasksProcessor.metaStorageOperation(DurableBackgroundTasksProcessor.java:335)
    at org.apache.ignite.internal.processors.localtask.DurableBackgroundTasksProcessor.afterCheckpointEnd(DurableBackgroundTasksProcessor.java:152)
    at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointWorkflow.markCheckpointEnd(CheckpointWorkflow.java:606)
    at org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.doCheckpoint(Checkpointer.java:479)
    at org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.body(Checkpointer.java:282)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
    at java.lang.Thread.run(Thread.java:748)
Caused by: class org.apache.ignite.internal.NodeStoppingException: Failed to perform cache update: node is stopping.
    ... 9 more
{noformat}

  was:
Checkpoint listener 
org.apache.ignite.internal.processors.localtask.DurableBackgroundTasksProcessor#afterCheckpointEnd
 which trigger at the end of checkpoint process can not take checkpoint read 
lock during node stoppingCheckpoint listener 
org.apache.ignite.internal.processors.localtask.DurableBackgroundTasksProcessor#afterCheckpointEnd
 which trigger at the end of checkpoint process can not take checkpoint read 
lock during node stopping
Run test(see exception in 
log):org.apache.ignite.internal.processors.cache.persistence.db.LongDestroyDurableBackgroundTaskTest#testDestroyTaskLifecycle
{noformat}[2021-05-05 
15:41:10,907][ERROR][db-checkpoint-thread-#87%db.LongDestroyDurableBackgroundTaskTest0%][root]
 Critical system error detected. Will be handled accordingly to configured 
handler [hnd=StopNodeFailureHandler [super=AbstractFailureHandler 
[ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, 
SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext 
[type=SYSTEM_WORKER_TERMINATION, err=class o.a.i.IgniteException: Failed to 
perform cache update: node is stopping.]]class 
org.apache.ignite.IgniteException: Failed to perform cache update: node is 
stopping. at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointTimeoutLock.checkpointReadLock(CheckpointTimeoutLock.java:127)
 at 
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.checkpointReadLock(GridCacheDatabaseSharedManager.java:1583)
 at 
org.apache.ignite.internal.processors.localtask.DurableBackgroundTasksProcessor.metaStorageOperation(DurableBackgroundTasksProcessor.java:335)
 at 
org.apache.ignite.internal.processors.localtask.DurableBackgroundTasksProcessor.afterCheckpointEnd(DurableBackgroundTasksProcessor.java:152)
 at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointWorkflow.markCheckpointEnd(CheckpointWorkflow.java:606)
 at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.doCheckpoint(Checkpointer.java:479)
 at 
org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.body(Checkpointer.java:282)
 at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119) 
at java.lang.Thread.run(Thread.java:748)Caused by: class 
org.apache.ignite.internal.NodeStoppingException: Failed to perform cache 
update: node is stopping. ... 9 more\{noformat}


> Stopping node at the end of checkpoint can cause "Critical system error"
> ------------------------------------------------------------------------
>
>                 Key: IGNITE-14684
>                 URL: https://issues.apache.org/jira/browse/IGNITE-14684
>             Project: Ignite
>          Issue Type: Bug
>          Components: persistence
>            Reporter: Maria Makedonskaya
>            Assignee: Kirill Tkalenko
>            Priority: Major
>
> Checkpoint listener 
> org.apache.ignite.internal.processors.localtask.DurableBackgroundTasksProcessor#afterCheckpointEnd,
> which is triggered at the end of the checkpoint process, cannot take the checkpoint read 
> lock while the node is stopping.
>
> To reproduce, run the test (see the exception in the log):
> org.apache.ignite.internal.processors.cache.persistence.db.LongDestroyDurableBackgroundTaskTest#testDestroyTaskLifecycle
> {noformat}
> [2021-05-05 15:41:10,907][ERROR][db-checkpoint-thread-#87%db.LongDestroyDurableBackgroundTaskTest0%][root] Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeFailureHandler [super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=class o.a.i.IgniteException: Failed to perform cache update: node is stopping.]]
> class org.apache.ignite.IgniteException: Failed to perform cache update: node is stopping.
>     at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointTimeoutLock.checkpointReadLock(CheckpointTimeoutLock.java:127)
>     at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.checkpointReadLock(GridCacheDatabaseSharedManager.java:1583)
>     at org.apache.ignite.internal.processors.localtask.DurableBackgroundTasksProcessor.metaStorageOperation(DurableBackgroundTasksProcessor.java:335)
>     at org.apache.ignite.internal.processors.localtask.DurableBackgroundTasksProcessor.afterCheckpointEnd(DurableBackgroundTasksProcessor.java:152)
>     at org.apache.ignite.internal.processors.cache.persistence.checkpoint.CheckpointWorkflow.markCheckpointEnd(CheckpointWorkflow.java:606)
>     at org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.doCheckpoint(Checkpointer.java:479)
>     at org.apache.ignite.internal.processors.cache.persistence.checkpoint.Checkpointer.body(Checkpointer.java:282)
>     at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: class org.apache.ignite.internal.NodeStoppingException: Failed to perform cache update: node is stopping.
>     ... 9 more
> {noformat}
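
The failure mode in the trace above can be illustrated with a small standalone sketch. It does not use Ignite's API: CheckpointLock, NodeStoppingException, unguardedListener and guardedListener below are hypothetical stand-ins written only for this illustration. The sketch assumes a lock that refuses new readers once the node has started stopping, and contrasts a listener that lets the exception escape to the worker thread with one that skips its work instead. The guarded variant is only one possible way to tolerate the condition, not necessarily the fix chosen for this issue.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class CheckpointListenerSketch {
    /** Hypothetical stand-in for the exception raised while the node is stopping. */
    static class NodeStoppingException extends RuntimeException {
        NodeStoppingException(String msg) {
            super(msg);
        }
    }

    /** Hypothetical lock that rejects new readers once stopping has begun. */
    static class CheckpointLock {
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        private final AtomicBoolean stopping = new AtomicBoolean();

        void markStopping() {
            stopping.set(true);
        }

        void readLock() {
            if (stopping.get())
                throw new NodeStoppingException("Failed to perform cache update: node is stopping.");

            lock.readLock().lock();
        }

        void readUnlock() {
            lock.readLock().unlock();
        }
    }

    /** Listener that lets the exception propagate to the calling worker thread. */
    static void unguardedListener(CheckpointLock lock, Runnable op) {
        lock.readLock();

        try {
            op.run();
        }
        finally {
            lock.readUnlock();
        }
    }

    /** Listener that skips its work while stopping; returns true only if op ran. */
    static boolean guardedListener(CheckpointLock lock, Runnable op) {
        try {
            lock.readLock();
        }
        catch (NodeStoppingException e) {
            return false; // Node is stopping: skip instead of failing the worker.
        }

        try {
            op.run();

            return true;
        }
        finally {
            lock.readUnlock();
        }
    }

    public static void main(String[] args) {
        CheckpointLock lock = new CheckpointLock();

        System.out.println(guardedListener(lock, () -> {})); // Lock is available, op runs.

        lock.markStopping();

        System.out.println(guardedListener(lock, () -> {})); // Op is skipped.

        try {
            unguardedListener(lock, () -> {});
        }
        catch (NodeStoppingException e) {
            System.out.println("Worker would terminate: " + e.getMessage());
        }
    }
}
```

In this sketch the unguarded call corresponds to the situation in the log, where the exception reaches the checkpoint thread and is reported as SYSTEM_WORKER_TERMINATION.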



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
