Hi Ilya,

I have tried to run this with both Ignite 2.10 and 2.11.
There are 24 thin clients on different VMs, which perform
*CRUD* operations on the *same and/or different caches* in parallel.
If I set the *TxTimeoutOnPartitionMapExchange* property to some value, then
the transactions started by the thin clients time out. If this
property is not set, the transactions neither time out nor get rolled back,
and in this case the cluster becomes unresponsive.
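
A minimal sketch of how this property can be set on the server nodes through
`TransactionConfiguration`. The class name `ServerTxConfig` and the 20-second
value are placeholders for illustration, not recommendations; it assumes
ignite-core is on the classpath and that the nodes are configured from Java
rather than Spring XML.

```java
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.configuration.TransactionConfiguration;

public class ServerTxConfig {
    /** Builds a server node configuration with a PME transaction timeout. */
    public static IgniteConfiguration build() {
        TransactionConfiguration txCfg = new TransactionConfiguration();

        // Roll back transactions that would otherwise hold up partition map
        // exchange for longer than 20 s (illustrative value only).
        txCfg.setTxTimeoutOnPartitionMapExchange(20_000L);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setTransactionConfiguration(txCfg);
        return cfg;
    }
}
```

A server node would then be started with `Ignition.start(ServerTxConfig.build())`.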

Is it advisable to always set TxTimeoutOnPartitionMapExchange to some
value?
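
If the timeout stays enabled, the clients presumably have to retry the
transactions that get timed out. Below is a rough sketch of such a retry
wrapper using only JDK classes. `TxRetry`, `retry`, the attempt count and the
backoff values are hypothetical names and numbers, not Ignite API, and real
code would narrow the `catch` to whatever exception the thin client actually
surfaces on a timeout or rollback.

```java
import java.util.concurrent.Callable;

public class TxRetry {
    /**
     * Runs op and retries it when it throws, up to maxAttempts times,
     * sleeping backoffMs * attempt between attempts (linear backoff).
     * Rethrows the last failure if every attempt fails.
     */
    public static <T> T retry(Callable<T> op, int maxAttempts, long backoffMs) throws Exception {
        if (maxAttempts < 1)
            throw new IllegalArgumentException("maxAttempts must be >= 1");

        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                // Assumption: every exception is treated as retryable here.
                last = e;
                if (attempt < maxAttempts)
                    Thread.sleep(backoffMs * attempt);
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated transaction body that fails twice before succeeding.
        String result = retry(() -> {
            if (++calls[0] < 3)
                throw new IllegalStateException("simulated tx timeout");
            return "committed";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

The whole transaction (start, cache operations, commit) would go inside the
`Callable`, so that each retry begins a fresh transaction.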

On Wed, Nov 3, 2021 at 7:17 AM Ilya Korol <llivezk...@gmail.com> wrote:

> Hi Sumit,
>
> Which Ignite version do you use?
> AFAIK partition map exchange is a kind of "stop the world" activity for
> the cluster, so any other actions with the cluster (like cache creation)
> would be suspended until PME ends.
> If all of your clients concurrently try to create the same cache, it's OK
> that there are some rolled-back transactions, but if your cluster became
> unresponsive after that, this looks like a bug, so you can submit a JIRA
> ticket with steps to reproduce this issue.
>
> But if you experience a deadlock, this looks like a bug.
>
> On 2021/11/02 13:45:49 Sumit Deshinge wrote:
>  > Hi,
>  >
>  > I have an Apache Ignite cluster of 3 Ignite servers and more than 20
>  > Ignite thin clients (each thin client on a separate VM). These thin
>  > clients try to create caches at approximately the same time, in
>  > parallel, and then start cache CRUD operations.
>  >
>  > It looks like the partition map exchange process and cache CRUD
>  > operations running in parallel are causing deadlocks or lock
>  > acquisition failures.
>  >
>  > *What should be the strategy to handle this scenario?*
>  >
>  > The Ignite server logs the errors below:
>  >
>  > *Exception stack trace 1:*
>  >
>  > WARNING: Dumping the near node thread that started transaction
>  > [xidVer=GridCacheVersion [topVer=247332659, order=1635852705217,
>  > nodeOrder=1], nodeId=2735bef0-7404-41e3-843f-7043490c9d84]
>  > Stack trace of the transaction owner thread:
>  > Thread [name="client-connector-#56%perf-dn1%", id=93, state=WAITING,
>  > blockCnt=5023, waitCnt=36165]
>  > at sun.misc.Unsafe.park(Native Method)
>  > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>  > at o.a.i.i.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
>  > at o.a.i.i.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
>  > at o.a.i.i.processors.cache.GridCacheAdapter$41.op(GridCacheAdapter.java:3430)
>  > at o.a.i.i.processors.cache.GridCacheAdapter$41.op(GridCacheAdapter.java:3423)
>  > at o.a.i.i.processors.cache.GridCacheAdapter.syncOp(GridCacheAdapter.java:4480)
>  > at o.a.i.i.processors.cache.GridCacheAdapter.remove0(GridCacheAdapter.java:3423)
>  > at o.a.i.i.processors.cache.GridCacheAdapter.remove(GridCacheAdapter.java:3405)
>  > at o.a.i.i.processors.cache.GridCacheAdapter.remove(GridCacheAdapter.java:3388)
>  > at o.a.i.i.processors.cache.IgniteCacheProxyImpl.remove(IgniteCacheProxyImpl.java:1438)
>  > at o.a.i.i.processors.cache.GatewayProtectedCacheProxy.remove(GatewayProtectedCacheProxy.java:964)
>  > at o.a.i.i.processors.platform.client.cache.ClientCacheRemoveKeyRequest.process(ClientCacheRemoveKeyRequest.java:41)
>  > at o.a.i.i.processors.platform.client.ClientRequestHandler.handle(ClientRequestHandler.java:77)
>  > at o.a.i.i.processors.odbc.ClientListenerNioListener.onMessage(ClientListenerNioListener.java:204)
>  > at o.a.i.i.processors.odbc.ClientListenerNioListener.onMessage(ClientListenerNioListener.java:55)
>  > at o.a.i.i.util.nio.GridNioFilterChain$TailFilter.onMessageReceived(GridNioFilterChain.java:279)
>  > at o.a.i.i.util.nio.GridNioFilterAdapter.proceedMessageReceived(GridNioFilterAdapter.java:109)
>  > at o.a.i.i.util.nio.GridNioAsyncNotifyFilter$3.body(GridNioAsyncNotifyFilter.java:97)
>  > at o.a.i.i.util.worker.GridWorker.run(GridWorker.java:120)
>  > at o.a.i.i.util.worker.GridWorkerPool$1.run(GridWorkerPool.java:70)
>  > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  > at java.lang.Thread.run(Thread.java:748)
>  >
>  > *Exception stack trace 2:*
>  >
>  > WARNING: >>> Transaction [startTime=11:39:27.214,
>  > curTime=11:40:36.277, systemTime=0, userTime=69063, tx=GridNearTxLocal
>  > [mappings=IgniteTxMappingsImpl [], nearLocallyMapped=false,
>  > colocatedLocallyMapped=false, needCheckBackup=null,
>  > hasRemoteLocks=false, trackTimeout=false, systemTime=44700,
>  > systemStartTime=0, prepareStartTime=0, prepareTime=0,
>  > commitOrRollbackStartTime=0, commitOrRollbackTime=0, lb=null,
>  > mvccOp=null, qryId=-1, crdVer=0,
>  > thread=client-connector-#57%perf-dn1%, mappings=IgniteTxMappingsImpl
>  > [], super=GridDhtTxLocalAdapter [nearOnOriginatingNode=false,
>  > span=o.a.i.i.processors.tracing.NoopSpan@4a931268,
>  > nearNodes=KeySetView [], dhtNodes=KeySetView [], explicitLock=false,
>  > super=IgniteTxLocalAdapter [completedBase=null,
>  > sndTransformedVals=false, depEnabled=false, txState=IgniteTxStateImpl
>  > [activeCacheIds=[], recovery=null, mvccEnabled=null,
>  > mvccCachingCacheIds=[], txMap=EmptySet []], super=IgniteTxAdapter
>  > [xidVer=GridCacheVersion [topVer=247332659, order=1635852705226,
>  > nodeOrder=1], writeVer=null, implicit=false, loc=true, threadId=95,
>  > startTime=1635853167214, nodeId=2735bef0-7404-41e3-843f-7043490c9d84,
>  > isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0,
>  > sysInvalidate=false, sys=false, plc=2, commitVer=null,
>  > finalizing=NONE, invalidParts=null, state=SUSPENDED, timedOut=false,
>  > topVer=AffinityTopologyVersion [topVer=-1, minorTopVer=0],
>  > mvccSnapshot=null, skipCompletedVers=false, parentTx=null,
>  > duration=69079ms, onePhaseCommit=false], size=0]]]]
>  > Nov 2, 2021 11:40:36 AM org.apache.ignite.logger.java.JavaLogger warning
>  > WARNING: First 10 long running cache futures [total=16]
>  > Nov 2, 2021 11:40:36 AM org.apache.ignite.logger.java.JavaLogger warning
>  > WARNING: >>> Future [startTime=11:39:27.324, curTime=11:40:36.277,
>  > fut=GridDhtLockFuture
>  > [span=o.a.i.i.processors.tracing.NoopSpan@4a931268,
>  > nearNodeId=2735bef0-7404-41e3-843f-7043490c9d84,
>  > nearLockVer=GridCacheVersion [topVer=247332659, order=1635852705208,
>  > nodeOrder=1], topVer=AffinityTopologyVersion [topVer=1,
>  > minorTopVer=162], threadId=124,
>  > futId=be58a60ec71-1d64903c-c700-4deb-bace-cc5158713120,
>  > lockVer=GridCacheVersion [topVer=247332659, order=1635852705208,
>  > nodeOrder=1], read=false, err=null, timedOut=false, timeout=0,
>  > tx=GridNearTxLocal [mappings=IgniteTxMappingsImpl []dNearNodes=null,
>  > ownerVer=GridCacheVersion [topVer=247332659, order=1635852705211,
>  > nodeOrder=1], serOrder=null, key=KeyCacheObjectImpl [part=244,
>  > val=data=6ff0c60ec71-625345be-9a91-497a-895e-abbe5df9da3d],
>  > hasValBytes=true],
>  >
>
> masks=local=1|owner=0|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_local=1|near_local=0|removed=0|read=0,
>  > prevVer=null, nextVer=null]], rmts=null]], flags=3]]], prepared=0,
>  > locked=false, nodeId=2735bef0-7404-41e3-843f-7043490c9d84,
>  > locMapped=false, expiryPlc=null, transferExpiryPlc=false, flags=2,
>  > partUpdateCntr=0, serReadVer=null, xidVer=GridCacheVersion
>  > [topVer=247332659, order=1635852705208, nodeOrder=1]]]],
>  > super=IgniteTxAdapter [xidVer=GridCacheVersion [topVer=247332659,
>  > order=1635852705208, nodeOrder=1], writeVer=null, implicit=false,
>  > loc=true, threadId=124, startTime=1635853167214,
>  > nodeId=2735bef0-7404-41e3-843f-7043490c9d84,
>  > isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0,
>  > sysInvalidate=false, sys=false, plc=2, commitVer=null,
>  > finalizing=NONE, invalidParts=null, state=ACTIVE, timedOut=false,
>  > topVer=AffinityTopologyVersion [topVer=1, minorTopVer=162],
>  > mvccSnapshot=null, skipCompletedVers=false, parentTx=null,
>  > duration=69094ms, onePhaseCommit=false], size=1]],
>  > nearLocallyMapped=false, colocatedLocallyMapped=true,
>  > needCheckBackup=null, hasRemoteLocks=false, trackTimeout=false,
>  > systemTime=75000, systemStartTime=971108549857700, prepareStartTime=0,
>  > prepareTime=0, commitOrRollbackStartTime=0, commitOrRollbackTime=0,
>  > lb=null, mvccOp=null, qryId=-1, crdVer=0,
>  > thread=client-connector-#84%perf-dn1%, mappings=IgniteTxMappingsImpl
>  > []dNearNodes=null, ownerVer=GridCacheVersion [topVer=247332659,
>  > order=1635852705211, nodeOrder=1], serOrder=null,
>  > key=KeyCacheObjectImpl [part=244,
>  > val=data=6ff0c60ec71-625345be-9a91-497a-895e-abbe5df9da3d],
>  > hasValBytes=true],
>  >
>
> masks=local=1|owner=0|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_local=1|near_local=0|removed=0|read=0,
>  > prevVer=null, nextVer=null]], rmts=null]], flags=3]]], prepared=0,
>  > locked=false, nodeId=2735bef0-7404-41e3-843f-7043490c9d84,
>  > locMapped=false, expiryPlc=null, transferExpiryPlc=false, flags=2,
>  > partUpdateCntr=0, serReadVer=null, xidVer=GridCacheVersion
>  > [topVer=247332659, order=1635852705208, nodeOrder=1]]]],
>  > super=IgniteTxAdapter [xidVer=GridCacheVersion [topVer=247332659,
>  > order=1635852705208, nodeOrder=1], writeVer=null, implicit=false,
>  > loc=true, threadId=124, startTime=1635853167214,
>  > nodeId=2735bef0-7404-41e3-843f-7043490c9d84,
>  > isolation=REPEATABLE_READ, concurrency=PESSIMISTIC, timeout=0,
>  > sysInvalidate=false, sys=false, plc=2, commitVer=null,
>  > finalizing=NONE, invalidParts=null, state=ACTIVE, timedOut=false,
>  > topVer=AffinityTopologyVersion [topVer=1, minorTopVer=162],
>  > mvccSnapshot=null, skipCompletedVers=false, parentTx=null,
>  > duration=69094ms, onePhaseCommit=false], size=1]],
>  > super=GridDhtTxLocalAdapter [nearOnOriginatingNode=false,
>  > span=o.a.i.i.processors.tracing.NoopSpan@4a931268,
>  > nearNodes=KeySetView [], dhtNodes=KeySetView [], explicitLock=false,
>  > super=IgniteTxLocalAdapter [completedBase=null,
>  > sndTransformedVals=false, depEnabled=false, txState=IgniteTxStateImpl
>  > [activeCacheIds=[585748697], recovery=false, mvccEnabled=false,
>  > mvccCachingCacheIds=[], txMap=ArrayList [IgniteTxEntry
>  > [txKey=IgniteTxKey [key=KeyCacheObjectImpl [part=244,
>  > val=data=6ff0c60ec71-625345be-9a91-497a-895e-abbe5df9da3d],
>  > hasValBytes=true], cacheId=585748697], val=TxEntryValueHolder
>  > [val=null, op=DELETE], prevVal=TxEntryValueHolder [val=null, op=NOOP],
>  > oldVal=TxEntryValueHolder [val=null, op=NOOP],
>  > entryProcessorsCol=null, ttl=-1, conflictExpireTime=-1,
>  > conflictVer=null, explicitVer=null, dhtVer=null,
>  > filters=CacheEntryPredicate[] [], filtersPassed=false,
>  > filtersSet=true, entry=GridDhtCacheEntry [rdrs=ReaderId[] [],
>  > part=244, super=GridDistributedCacheEntry [super=GridCacheMapEntry
>  > [key=KeyCacheObjectImpl [part=244,
>  > val=data=6ff0c60ec71-625345be-9a91-497a-895e-abbe5df9da3d],
>  > hasValBytes=true], val=null, ver=GridCacheVersion [topVer=247332659,
>  > order=1635852705229, nodeOrder=1], hash=1085684290,
>  > extras=GridCacheMvccEntryExtras [mvcc=GridCacheMvcc [locs=LinkedList
>  > [GridCacheMvccCandidate [nodeId=2735bef0-7404-41e3-843f-7043490c9d84,
>  > ver=GridCacheVersion [topVer=247332659, order=1635852705207,
>  > nodeOrder=1], threadId=122, id=2104, topVer=AffinityTopologyVersion
>  > [topVer=1, minorTopVer=162], reentry=null,
>  > otherNodeId=2735bef0-7404-41e3-843f-7043490c9d84,
>  > otherVer=GridCacheVersion [topVer=247332659, order=1635852705207,
>  > nodeOrder=1], mappedDhtNodes=null, mappedNearNodes=null,
>  > ownerVer=GridCacheVersion [topVer=247332659, order=1635852705211,
>  > nodeOrder=1], serOrder=null, key=KeyCacheObjectImpl [part=244,
>  > val=data=6ff0c60ec71-625345be-9a91-497a-895e-abbe5df9da3d],
>  > hasValBytes=true],
>  >
>
> masks=local=1|owner=1|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_local=1|near_local=0|removed=0|read=0,
>  > prevVer=null, nextVer=null], GridCacheMvccCandidate
>  > [nodeId=2735bef0-7404-41e3-843f-7043490c9d84, ver=GridCacheVersion
>  > [topVer=247332659, order=1635852705208, nodeOrder=1], threadId=124,
>  > id=2102, topVer=AffinityTopologyVersion [topVer=1, minorTopVer=162],
>  > reentry=null, otherNodeId=2735bef0-7404-41e3-843f-7043490c9d84,
>  > otherVer=GridCacheVersion [topVer=247332659, order=1635852705208,
>  > nodeOrder=1], mappedDhtNodes=null, mappedNearNodes=null,
>  > ownerVer=GridCacheVersion [topVer=247332659, order=1635852705211,
>  > nodeOrder=1], serOrder=null, key=KeyCacheObjectImpl [part=244,
>  > val=data=6ff0c60ec71-625345be-9a91-497a-895e-abbe5df9da3d],
>  > hasValBytes=true],
>  >
>
> masks=local=1|owner=0|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_local=1|near_local=0|removed=0|read=0,
>  > prevVer=null, nextVer=null], GridCacheMvccCandidate
>  > [nodeId=2735bef0-7404-41e3-843f-7043490c9d84, ver=GridCacheVersion
>  > [topVer=247332659, order=1635852705213, nodeOrder=1], threadId=122,
>  > id=2120, topVer=AffinityTopologyVersion [topVer=1, minorTopVer=162],
>  > reentry=null, otherNodeId=2735bef0-7404-41e3-843f-7043490c9d84,
>  > otherVer=GridCacheVersion [topVer=247332659, order=1635852705213,
>  > nodeOrder=1], mappedDhtNodes=null, mappedNearNodes=null,
>  > ownerVer=GridCacheVersion [topVer=247332659, order=1635852705207,
>  > nodeOrder=1], serOrder=null, key=KeyCacheObjectImpl [part=244,
>  > val=data=6ff0c60ec71-625345be-9a91-497a-895e-abbe5df9da3d],
>  > hasValBytes=true],
>  >
>
> masks=local=1|owner=0|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_local=1|near_local=0|removed=0|read=0,
>  > prevVer=null, nextVer=null], GridCacheMvccCandidate
>  > [nodeId=2735bef0-7404-41e3-843f-7043490c9d84, ver=GridCacheVersion
>  > [topVer=247332659, order=1635852705214, nodeOrder=1], threadId=123,
>  > id=2118, topVer=AffinityTopologyVersion [topVer=1, minorTopVer=162],
>  > reentry=null, otherNodeId=2735bef0-7404-41e3-843f-7043490c9d84,
>  > otherVer=GridCacheVersion [topVer=247332659, order=1635852705214,
>  > nodeOrder=1], mappedDhtNodes=null, mappedNearNodes=null,
>  > ownerVer=GridCacheVersion [topVer=247332659, order=1635852705207,
>  > nodeOrder=1], serOrder=null, key=KeyCacheObjectImpl [part=244,
>  > val=data=6ff0c60ec71-625345be-9a91-497a-895e-abbe5df9da3d],
>  > hasValBytes=true],
>  >
>
> masks=local=1|owner=0|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_local=1|near_local=0|removed=0|read=0,
>  > prevVer=null, nextVer=null], GridCacheMvccCandidate
>  > [nodeId=2735bef0-7404-41e3-843f-7043490c9d84, ver=GridCacheVersion
>  > [topVer=247332659, order=1635852705217, nodeOrder=1], threadId=93,
>  > id=2108, topVer=AffinityTopologyVersion [topVer=1, minorTopVer=162],
>  > reentry=null, otherNodeId=2735bef0-7404-41e3-843f-7043490c9d84,
>  > otherVer=GridCacheVersion [topVer=247332659, order=1635852705217,
>  > nodeOrder=1], mappedDhtNodes=null, mappedNearNodes=null,
>  > ownerVer=GridCacheVersion [topVer=247332659, order=1635852705211,
>  > nodeOrder=1], serOrder=null, key=KeyCacheObjectImpl [part=244,
>  > val=data=6ff0c60ec71-625345be-9a91-497a-895e-abbe5df9da3d],
>  > hasValBytes=true],
>  >
>
> masks=local=1|owner=0|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_local=1|near_local=0|removed=0|read=0,
>  > prevVer=null, nextVer=null], GridCacheMvccCandidate
>  > [nodeId=2735bef0-7404-41e3-843f-7043490c9d84, ver=GridCacheVersion
>  > [topVer=247332659, order=1635852705218, nodeOrder=1], threadId=115,
>  > id=2106, topVer=AffinityTopologyVersion [topVer=1, minorTopVer=162],
>  > reentry=null, otherNodeId=2735bef0-7404-41e3-843f-7043490c9d84,
>  > otherVer=GridCacheVersion [topVer=247332659, order=1635852705218,
>  > nodeOrder=1], mappedDhtNodes=null, mappedNearNodes=null,
>  > ownerVer=GridCacheVersion [topVer=247332659, order=1635852705211,
>  > nodeOrder=1], serOrder=null, key=KeyCacheObjectImpl [part=244,
>  > val=data=6ff0c60ec71-625345be-9a91-497a-895e-abbe5df9da3d],
>  > hasValBytes=true],
>  >
>
> masks=local=1|owner=0|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_local=1|near_local=0|removed=0|read=0,
>  > prevVer=null, nextVer=null], GridCacheMvccCandidate
>  > [nodeId=2735bef0-7404-41e3-843f-7043490c9d84, ver=GridCacheVersion
>  > [topVer=247332659, order=1635852705222, nodeOrder=1], threadId=95,
>  > id=2110, topVer=AffinityTopologyVersion [topVer=1, minorTopVer=162],
>  > reentry=null, otherNodeId=2735bef0-7404-41e3-843f-7043490c9d84,
>  > otherVer=GridCacheVersion [topVer=247332659, order=1635852705222,
>  > nodeOrder=1], mappedDhtNodes=null, mappedNearNodes=null,
>  > ownerVer=GridCacheVersion [topVer=247332659, order=1635852705211,
>  > nodeOrder=1], serOrder=null, key=KeyCacheObjectImpl [part=244,
>  > val=data=6ff0c60ec71-625345be-9a91-497a-895e-abbe5df9da3d],
>  > hasValBytes=true],
>  >
>
> masks=local=1|owner=0|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_local=1|near_local=0|removed=0|read=0,
>  > prevVer=null, nextVer=null], GridCacheMvccCandidate
>  > [nodeId=2735bef0-7404-41e3-843f-7043490c9d84, ver=GridCacheVersion
>  > [topVer=247332659, order=1635852705223, nodeOrder=1], threadId=120,
>  > id=2112, topVer=AffinityTopologyVersion [topVer=1, minorTopVer=162],
>  > reentry=null, otherNodeId=2735bef0-7404-41e3-843f-7043490c9d84,
>  > otherVer=GridCacheVersion [topVer=247332659, order=1635852705223,
>  > nodeOrder=1], mappedDhtNodes=null, mappedNearNodes=null,
>  > ownerVer=GridCacheVersion [topVer=247332659, order=1635852705211,
>  > nodeOrder=1], serOrder=null, key=KeyCacheObjectImpl [part=244,
>  > val=data=6ff0c60ec71-625345be-9a91-497a-895e-abbe5df9da3d],
>  > hasValBytes=true],
>  >
>
> masks=local=1|owner=0|ready=1|reentry=0|used=0|tx=1|single_implicit=0|dht_loca
> [message truncated...]
>


-- 

Sumit Deshinge

R&D Engineer | Symantec Enterprise Division

Broadcom Software

Email: Sumit Deshinge <sumit.deshi...@broadcom.com>

