Hi there,
We had an unresponsive cluster today after the following error:
[2019-10-09T07:08:13,623][ERROR][sys-stripe-94-#95][GridCacheDatabaseSharedManager] Checkpoint read lock acquisition has been timed out.
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$CheckpointReadLockTimeoutException: Checkpoint read lock acquisition has been timed out.
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.failCheckpointReadLock(GridCacheDatabaseSharedManager.java:1564) ~[ignite-core-2.7.6.jar:2.7.6]
    at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.checkpointReadLock(GridCacheDatabaseSharedManager.java:1497) [ignite-core-2.7.6.jar:2.7.6]
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1739) [ignite-core-2.7.6.jar:2.7.6]
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1668) [ignite-core-2.7.6.jar:2.7.6]
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:3138) [ignite-core-2.7.6.jar:2.7.6]
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$400(GridDhtAtomicCache.java:135) [ignite-core-2.7.6.jar:2.7.6]
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:271) [ignite-core-2.7.6.jar:2.7.6]
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:266) [ignite-core-2.7.6.jar:2.7.6]
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1056) [ignite-core-2.7.6.jar:2.7.6]
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:581) [ignite-core-2.7.6.jar:2.7.6]
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:380) [ignite-core-2.7.6.jar:2.7.6]
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:306) [ignite-core-2.7.6.jar:2.7.6]
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:101) [ignite-core-2.7.6.jar:2.7.6]
    at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:295) [ignite-core-2.7.6.jar:2.7.6]
    at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1569) [ignite-core-2.7.6.jar:2.7.6]
    at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1197) [ignite-core-2.7.6.jar:2.7.6]
    at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:127) [ignite-core-2.7.6.jar:2.7.6]
    at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1093) [ignite-core-2.7.6.jar:2.7.6]
    at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:505) [ignite-core-2.7.6.jar:2.7.6]
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) [ignite-core-2.7.6.jar:2.7.6]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
After this log entry, the cluster somehow went into an infinite loop and became
unresponsive. Since the log files are larger than 5 MB, I am sharing a Google Drive
link to all of them:
https://drive.google.com/drive/folders/1XHaw2YZq3_F4CMw8m_mJZkUz1K17njU9?usp=sharing
Any help is appreciated.
Thanks,
-----
İbrahim Halil Altun
Senior Software Engineer @ Segmentify