Ok. For now I rebooted all nodes... But it's fairly easy to reproduce. On Wed., Oct. 6, 2021, 2:17 a.m. Zhenya Stanilovsky, <arzamas...@mail.ru> wrote:
> Ok, it seems something is going wrong on the node with
> id=36edbfd5-4feb-417e-b965-bdc34a0a6f4f. If you still have a problem, can
> you send these logs here or directly to me?
>
> > And finally this on the coordinator node....
> >
> > [14:07:41,282][WARNING][exchange-worker-#42%xxxxxx%][GridDhtPartitionsExchangeFuture]
> > Unable to await partitions release latch within timeout. Some nodes have
> > not sent acknowledgement for latch completion. It's possible due to
> > unfinished atomic updates, transactions or not released explicit locks
> > on that nodes. Please check logs for errors on nodes with ids reported
> > in latch `pendingAcks` collection [latch=ServerLatch [permits=1,
> > pendingAcks=HashSet [36edbfd5-4feb-417e-b965-bdc34a0a6f4f],
> > super=CompletableLatch [id=CompletableLatchUid [id=exchange,
> > topVer=AffinityTopologyVersion [topVer=103, minorTopVer=0]]]]]
> >
> > On Tue, 5 Oct 2021 at 10:07, John Smith <java.dev....@gmail.com> wrote:
> >
> > And I see this...
> >
> > [14:04:15,150][WARNING][exchange-worker-#43%raange%][GridDhtPartitionsExchangeFuture]
> > Unable to await partitions release latch within timeout. For more
> > details please check coordinator node logs [crdNode=TcpDiscoveryNode
> > [id=36ad785d-e344-43bb-b685-e79557572b54,
> > consistentId=8172e45d-3ff8-4fe4-aeda-e7d30c1e11e2, addrs=ArrayList
> > [127.0.0.1, xxxxxx.65], sockAddrs=HashSet [xxxxxx-0002/xxxxxx.65:47500,
> > /127.0.0.1:47500], discPort=47500, order=1, intOrder=1,
> > lastExchangeTime=1633370987399, loc=false,
> > ver=2.8.1#20200521-sha1:86422096, isClient=false]] [latch=ClientLatch
> > [coordinator=TcpDiscoveryNode [id=36ad785d-e344-43bb-b685-e79557572b54,
> > consistentId=8172e45d-3ff8-4fe4-aeda-e7d30c1e11e2, addrs=ArrayList
> > [127.0.0.1, xxxxxx.65], sockAddrs=HashSet [xxxxxx-0002/xxxxxx.65:47500,
> > /127.0.0.1:47500], discPort=47500, order=1, intOrder=1,
> > lastExchangeTime=1633370987399, loc=false,
> > ver=2.8.1#20200521-sha1:86422096, isClient=false], ackSent=true,
> > super=CompletableLatch [id=CompletableLatchUid [id=exchange,
> > topVer=AffinityTopologyVersion [topVer=103, minorTopVer=0]]]]]
> >
> > On Tue, 5 Oct 2021 at 10:02, John Smith <java.dev....@gmail.com> wrote:
> >
> > Actually, to be more clear...
> >
> > http://xxxxxx-0001:8080/ignite?cmd=version responds immediately.
> >
> > http://xxxxxx-0001:8080/ignite?cmd=size&cacheName=my-cache doesn't
> > respond at all.
> >
> > On Tue, 5 Oct 2021 at 09:59, John Smith <java.dev....@gmail.com> wrote:
> >
> > Yeah, ever since I got this error, for example, the REST API won't
> > return and the requests are slower. But when I connect with visor I can
> > get stats, I can scan the cache, etc...
> >
> > Is it possible that these async futures/threads are not released?
> >
> > On Tue, 5 Oct 2021 at 04:11, Zhenya Stanilovsky <arzamas...@mail.ru> wrote:
> >
> > Hi, this is just a warning showing that something suspicious was
> > observed. There is no simple reply to your question; in the common case
> > all these messages are due to cluster (resources or settings)
> > limitations. Check the documentation on tuning performance [1].
> >
> > [1] https://ignite.apache.org/docs/latest/perf-and-troubleshooting/general-perf-tips
> >
> > Hi, I'm using 2.8.1. I understand the message as meaning my async TRX is
> > taking longer, but is there a way to prevent it?
> >
> > When this happened I was pushing about 50,000 get/puts per second from
> > my API.
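John's question above, whether the async futures are ever released, points at a real failure mode: if putAsync is issued 50,000 times per second with nothing bounding the number of pending futures, unfinished atomic updates can pile up and keep the partitions-release latch from completing. Below is a minimal sketch of one way to apply back-pressure with a Semaphore. It is not code from the thread; the cache name "my-cache" is taken from the REST URL above, and the 512-permit limit and class name are illustrative assumptions.

    import java.util.concurrent.Semaphore;

    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.lang.IgniteFuture;

    public class BoundedAsyncPuts {
        public static void main(String[] args) throws InterruptedException {
            Ignite ignite = Ignition.start();

            // Cache name taken from the REST URL in the thread.
            IgniteCache<Integer, String> cache = ignite.getOrCreateCache("my-cache");

            // Cap on concurrently pending putAsync futures; 512 is an
            // arbitrary starting point, tune it for your cluster.
            Semaphore inFlight = new Semaphore(512);

            for (int i = 0; i < 1_000_000; i++) {
                inFlight.acquire(); // Block here instead of queueing unboundedly.

                IgniteFuture<Void> fut = cache.putAsync(i, "value-" + i);

                fut.listen(f -> inFlight.release()); // Free a permit when the put completes.
            }

            inFlight.acquire(512); // Drain: wait for all pending puts to finish.

            ignite.close();
        }
    }

With this pattern the loop never has more than 512 unacknowledged updates in flight, so a slow node or a partition map exchange waits on a bounded, draining set of operations rather than an ever-growing one.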
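The coordinator warning also lists unfinished transactions and unreleased explicit locks as possible causes of the stuck latch. Here is a short sketch of the usual release patterns with Ignite's standard transaction and cache-lock APIs; the keys, values, class, and method names are hypothetical placeholders, not code from the thread.

    import java.util.concurrent.locks.Lock;

    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.transactions.Transaction;

    public class ReleasePatterns {
        // try-with-resources: if commit() is never reached, close() rolls the
        // transaction back, so it cannot linger and hold up the exchange latch.
        static void updateInTx(Ignite ignite, IgniteCache<Integer, String> cache) {
            try (Transaction tx = ignite.transactions().txStart()) {
                cache.put(1, "value");
                tx.commit();
            }
        }

        // Explicit locks (TRANSACTIONAL caches only) must be released in a
        // finally block, or a failed update leaves the key locked.
        static void updateWithLock(IgniteCache<Integer, String> cache) {
            Lock lock = cache.lock(1);

            lock.lock();

            try {
                cache.put(1, "value");
            }
            finally {
                lock.unlock();
            }
        }
    }

Either pattern guarantees the resource is released on the exception path, which is what the `pendingAcks` warning asks you to verify on the node it names.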