Everything is fine. Merged to master branch.

On Fri, Jan 10, 2020 at 9:48 AM Anton Vinogradov <a...@apache.org> wrote:
> >> Does the issue reproduce in
> >> subsequent runs?
> Unfortunately no.
> We performed 30+ runs without "success".
>
> >> I think we can add an assertion to
> >> GridDhtLocalPartition#destroy() method to check that reservations is 0
> Ok, I will check and merge in case of success.
> Created the issue to handle this [1].
>
> [1] https://issues.apache.org/jira/browse/IGNITE-12524
>
> On Thu, Jan 9, 2020 at 1:46 PM Alexey Goncharuk <alexey.goncha...@gmail.com> wrote:
>
>> Hello Anton,
>>
>> Thanks for digging into this. The logic with checking the reservations
>> count seems fishy to me as well, so I have no objections to the suggested
>> change. This "if" statement does not answer why the partition was being
>> destroyed during the commit, though. Does the issue reproduce in
>> subsequent runs?
>>
>> The logic around reserve/release seems ok to me; however, the
>> eviction/renting code looks overly complicated. Perhaps there is a bug
>> somewhere there? I think we can add an assertion to
>> GridDhtLocalPartition#destroy() method to check that reservations is 0
>> when this method is called (there is a check for EVICTED state already
>> there).
>>
>> --AG
>>
>> On Thu, Jan 9, 2020 at 09:45, Anton Vinogradov <a...@apache.org> wrote:
>>
>> > Folks,
>> > Yardstick run (opt-serial-put-get-1-backup) failed with an interesting
>> > exception:
>> >
>> > Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.transactions.IgniteTxHeuristicCheckedException: Committing a transaction has produced runtime exception]]
>> > class org.apache.ignite.internal.transactions.IgniteTxHeuristicCheckedException: Committing a transaction has produced runtime exception
>> >     at org.apache.ignite.internal.processors.cache.transactions.IgniteTxAdapter.heuristicException(IgniteTxAdapter.java:800)
>> >     at org.apache.ignite.internal.processors.cache.distributed.GridDistributedTxRemoteAdapter.commitIfLocked(GridDistributedTxRemoteAdapter.java:838)
>> >     at org.apache.ignite.internal.processors.cache.distributed.GridDistributedTxRemoteAdapter.commitRemoteTx(GridDistributedTxRemoteAdapter.java:893)
>> >     at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.finish(IgniteTxHandler.java:1452)
>> >     at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processDhtTxFinishRequest(IgniteTxHandler.java:1375)
>> >     at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$600(IgniteTxHandler.java:123)
>> >     at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$7.apply(IgniteTxHandler.java:241)
>> >     at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$7.apply(IgniteTxHandler.java:239)
>> >     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1142)
>> >     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:591)
>> >     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:392)
>> >     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:318)
>> >     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:109)
>> >     at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:308)
>> >     at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1843)
>> >     at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1468)
>> >     at org.apache.ignite.internal.managers.communication.GridIoManager.access$5200(GridIoManager.java:229)
>> >     at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1365)
>> >     at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:555)
>> >     at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>> >     at java.lang.Thread.run(Thread.java:748)
>> > Caused by: java.lang.IllegalStateException: Tree is being concurrently destroyed: tx-p-470##CacheData
>> >     at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.checkDestroyed(BPlusTree.java:1011)
>> >     at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invoke(BPlusTree.java:1831)
>> >     at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke0(IgniteCacheOffheapManagerImpl.java:1696)
>> >     at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.invoke(IgniteCacheOffheapManagerImpl.java:1679)
>> >     at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.invoke(IgniteCacheOffheapManagerImpl.java:441)
>> >     at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.storeValue(GridCacheMapEntry.java:4288)
>> >     at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.storeValue(GridCacheMapEntry.java:4262)
>> >     at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerSet(GridCacheMapEntry.java:1540)
>> >     at org.apache.ignite.internal.processors.cache.distributed.GridDistributedTxRemoteAdapter.commitIfLocked(GridDistributedTxRemoteAdapter.java:675)
>> >     ... 19 more
>> >
>> > It seems the BPlusTree was destroyed between
>> > GridDistributedTxRemoteAdapter.java:545 and
>> > GridDistributedTxRemoteAdapter.java:675 while the partition was reserved.
>> >
>> > See the full log [1] for details.
>> >
>> > During the investigation, some weird code was found:
>> >
>> > private void release0(int sizeChange) {
>> >     while (true) {
>> >         long state = this.state.get();
>> >
>> >         int reservations = getReservations(state);
>> >
>> >         if (reservations == 0) // How can it be zero at release attempt?
>> >             return;
>> >
>> > I've replaced this weird code with an assertion [2] and checked at
>> > TeamCity twice; nothing failed.
>> >
>> > So, questions:
>> > 1) Any idea why we are able to have zero reservations at a release
>> > attempt?
>> > 2) Any objection to merging the assertion instead of the weird return
>> > to the master branch?
>> > 3) Any idea why the exception happens?
>> >
>> > [1] https://gist.githubusercontent.com/anton-vinogradov/834fc63114a3e8d46b89ea4ccec8148b/raw/6438930c7fef119d0ad60df76d821fe7bd100c5e/gistfile1.txt
>> > [2] https://gitbox.apache.org/repos/asf?p=ignite.git;a=commitdiff;h=b2c083564fb3b48ebe87042e0ed442dc0af3a74d
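
For readers following the reserve/release discussion, the two checks proposed in this thread (fail on a release with zero reservations, and fail on destroy while reservations are held) can be sketched in isolation. This is a minimal illustration under stated assumptions, not Ignite's code: the class `ReservationCounter`, the `DESTROYED` flag, and the bit layout of the packed state are all hypothetical. It also throws `IllegalStateException` where the thread talks about Java assertions, so the checks fire even when the JVM runs without `-ea`.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a partition's reservation bookkeeping; not Ignite code. */
public class ReservationCounter {
    /** Assumed layout: low 32 bits hold the reservation count, bit 32 marks "destroyed". */
    private static final long DESTROYED = 1L << 32;

    private final AtomicLong state = new AtomicLong();

    private static int getReservations(long state) {
        return (int) (state & 0xFFFFFFFFL);
    }

    public int reservations() {
        return getReservations(state.get());
    }

    /** Takes a reservation unless the resource is already destroyed. */
    public boolean reserve() {
        while (true) {
            long cur = state.get();

            if ((cur & DESTROYED) != 0)
                return false;

            if (state.compareAndSet(cur, cur + 1))
                return true;
        }
    }

    /** Fails loudly instead of silently returning when there is nothing to release. */
    public void release() {
        while (true) {
            long cur = state.get();

            if (getReservations(cur) == 0)
                throw new IllegalStateException("release() without a matching reserve()");

            if (state.compareAndSet(cur, cur - 1))
                return;
        }
    }

    /** Refuses to destroy while any reservation is still held. */
    public void destroy() {
        while (true) {
            long cur = state.get();

            if (getReservations(cur) != 0)
                throw new IllegalStateException("destroy() while reserved: " + getReservations(cur));

            if (state.compareAndSet(cur, cur | DESTROYED))
                return;
        }
    }
}
```

Because the count and the flag live in one `AtomicLong`, each compare-and-set observes both at once, so in this sketch a `destroy()` cannot slip in between a successful `reserve()` and its `release()`. Whether the real eviction/renting path has that property is exactly the open question in the thread.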