Ok, additionally there is info about node segmentation:
Node FAILED: TcpDiscoveryNode [id=fb67a5fd-f1ab-441d-a38e-bab975cd1037, 
consistentId=0:0:0:0:0:0:0:1%lo,XX.XX.XX.XX,127.0.0.1:47500, addrs=ArrayList 
[0:0:0:0:0:0:0:1%lo, XX.XX.XX.XX, 127.0.0.1], sockAddrs=HashSet 
[qagmscore02.xyz.com/XX.XX.XX.XX:47500, /0:0:0:0:0:0:0:1%lo:47500, 
/127.0.0.1:47500], discPort=47500, order=25, intOrder=16, 
lastExchangeTime=1633426750418, loc=false, ver=2.10.0#20210310-sha1:bc24f6ba, 
isClient=false]
 
Local node SEGMENTED: TcpDiscoveryNode [id=7f357ca2-0ae2-4af0-bfa4-d18e7bcb3797
 
Possible too long JVM pause: 1052 milliseconds.

Have you changed the default networking timeout settings? If not, try rechecking
the failureDetectionTimeout setting.
If you have a GC pause longer than 10 seconds, the node will be dropped from the
cluster (by default).
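
For example, a minimal sketch of raising the timeout (the 30-second value is only
an illustration; tune it to your longest expected pause):

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;

// failureDetectionTimeout bounds how long a node may stay unresponsive
// (GC pause, network hiccup) before the cluster drops it; default is 10_000 ms.
IgniteConfiguration cfg = new IgniteConfiguration();
cfg.setFailureDetectionTimeout(30_000);
Ignite ignite = Ignition.start(cfg);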
 
>This is the codebase of AgmsCacheJdbcStoreSessionListener.java.
>The NullPointerException occurs because the datasource bean is not found.
>The cluster was working fine, so what could make the datasource bean
>unavailable in a running cluster?
> 
> 
>import javax.sql.DataSource;
>
>import org.apache.ignite.cache.store.jdbc.CacheJdbcStoreSessionListener;
>import org.apache.ignite.resources.SpringApplicationContextResource;
>import org.springframework.context.ApplicationContext;
>
>public class AgmsCacheJdbcStoreSessionListener extends
>    CacheJdbcStoreSessionListener {
>
>  // Ignite injects the Spring application context into this setter;
>  // line 14 (the bean lookup) is where the NPE in the trace below is thrown.
>  @SpringApplicationContextResource
>  public void setupDataSourceFromSpringContext(Object appCtx) {
>    ApplicationContext appContext = (ApplicationContext) appCtx;
>    setDataSource((DataSource) appContext.getBean("dataSource"));
>  }
>}
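>
>A hedged sketch of a more defensive variant (the null-context guard and its
>message are illustrative assumptions, not part of the original code; note that
>a genuinely missing bean would surface as NoSuchBeanDefinitionException rather
>than an NPE, so an NPE here points at a null application context):
>
>  @SpringApplicationContextResource
>  public void setupDataSourceFromSpringContext(Object appCtx) {
>    if (appCtx == null) {
>      // Hypothetical fail-fast guard: clearer than an NPE, e.g. when the
>      // node was not started with Spring so no context gets injected.
>      throw new IllegalStateException(
>          "Spring ApplicationContext was not injected into the listener");
>    }
>    ApplicationContext appContext = (ApplicationContext) appCtx;
>    setDataSource(appContext.getBean("dataSource", DataSource.class));
>  }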
> 
>I can see one log line that points to a problem on the network side. Could
>this be the reason?
> 
>2021-10-07 16:28:22,889 197776202 [tcp-disco-msg-worker-[fb67a5fd XX.XX.XX.XX:47500 crd]-#2%springDataNode%-#69%springDataNode%] WARN  o.a.i.s.d.tcp.TcpDiscoverySpi - Node is out of topology (probably, due to short-time network problems).
>   
>On Mon, Oct 11, 2021 at 7:15 PM stanilovsky evgeny < 
>estanilovs...@gridgain.com > wrote:
>>Maybe this?
>> 
>>Caused by: java.lang.NullPointerException: null
>>at com.xyz.agms.grid.cache.loader.AgmsCacheJdbcStoreSessionListener.setupDataSourceFromSpringContext(AgmsCacheJdbcStoreSessionListener.java:14)
>>... 23 common frames omitted
>> 
>> 
>>>Hi Zhenya,
>>>CacheStoppedException occurred again on our Ignite cluster. I have captured
>>>logs with  IGNITE_QUIET = false.
>>>There are four core nodes in the cluster and two of them went down. I am
>>>attaching the logs for the two failed nodes.
>>>Please let me know if you need any further details.
>>> 
>>>Thanks,
>>>Akash   
>>>On Tue, Sep 7, 2021 at 12:19 PM Zhenya Stanilovsky < arzamas...@mail.ru > 
>>>wrote:
>>>>Please share these logs somehow; if you have no idea how to share them, you
>>>>can send them directly to  arzamas...@mail.ru
>>>>   
>>>>>Meanwhile I will grep the logs for the next occurrence of the cache stopped
>>>>>exception. Can someone highlight whether there is any known bug related to
>>>>>this? I want to check the possible reason for this exception.
>>>>>On Mon, Sep 6, 2021 at 6:27 PM Akash Shinde < akashshi...@gmail.com > 
>>>>>wrote:
>>>>>>Hi Zhenya,
>>>>>>Thanks for the quick response.
>>>>>>I believe you are talking about Ignite instances. There is a single Ignite
>>>>>>instance used in the application.
>>>>>>I also want to point out that I am not using the destroyCache() method
>>>>>>anywhere in the application.
>>>>>> 
>>>>>>I will set IGNITE_QUIET = false and try to grep the required logs.
>>>>>>This issue occurs at random and there is no way to reproduce it.
>>>>>> 
>>>>>>Thanks,
>>>>>>Akash
>>>>>> 
>>>>>>   
>>>>>>On Mon, Sep 6, 2021 at 5:33 PM Zhenya Stanilovsky < arzamas...@mail.ru > 
>>>>>>wrote:
>>>>>>>Hi, Akash
>>>>>>>You can run into such a case, for example, when you have several Ignite
>>>>>>>instances:
>>>>>>>inst1:
>>>>>>>  cache = inst1.getOrCreateCache("cache1");
>>>>>>>
>>>>>>>inst2:
>>>>>>>  inst2.destroyCache("cache1");
>>>>>>>
>>>>>>>and then, after inst2's destroy call, on inst1:
>>>>>>>
>>>>>>>  cache._some_method_call_
>>>>>>>In short: you are still using a cache instance that has already been
>>>>>>>destroyed. You can simply grep your logs and find the time when the cache
>>>>>>>was stopped; you probably need to set  IGNITE_QUIET = false first.
>>>>>>>[1]  https://ignite.apache.org/docs/latest/logging
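>>>>>>>
>>>>>>>A runnable sketch of that scenario (the instance names and the put() call
>>>>>>>are illustrative assumptions):
>>>>>>>
>>>>>>>import org.apache.ignite.Ignite;
>>>>>>>import org.apache.ignite.IgniteCache;
>>>>>>>import org.apache.ignite.Ignition;
>>>>>>>import org.apache.ignite.configuration.IgniteConfiguration;
>>>>>>>
>>>>>>>public class CacheStoppedRepro {
>>>>>>>  public static void main(String[] args) {
>>>>>>>    // Two server nodes in one JVM, purely for illustration.
>>>>>>>    Ignite inst1 = Ignition.start(new IgniteConfiguration().setIgniteInstanceName("inst1"));
>>>>>>>    Ignite inst2 = Ignition.start(new IgniteConfiguration().setIgniteInstanceName("inst2"));
>>>>>>>
>>>>>>>    IgniteCache<Integer, String> cache = inst1.getOrCreateCache("cache1");
>>>>>>>
>>>>>>>    inst2.destroyCache("cache1"); // removes the cache cluster-wide
>>>>>>>
>>>>>>>    // The stale proxy now throws IllegalStateException caused by
>>>>>>>    // CacheStoppedException, as in the stack trace further down the thread.
>>>>>>>    cache.put(1, "x");
>>>>>>>  }
>>>>>>>}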
>>>>>>> 
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>>Hi,
>>>>>>>>>>I have four server nodes and six client nodes in the Ignite cluster. I am
>>>>>>>>>>using Ignite version 2.10.
>>>>>>>>>>Some operations are failing with CacheStoppedException on the server
>>>>>>>>>>nodes. This has become a blocker issue. Could someone please help me
>>>>>>>>>>resolve it?
>>>>>>>>>> 
>>>>>>>>>>Cache Configuration
>>>>>>>>>>CacheConfiguration subscriptionCacheCfg =
>>>>>>>>>>    new CacheConfiguration<>(CacheName.SUBSCRIPTION_CACHE.name());
>>>>>>>>>>subscriptionCacheCfg.setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL);
>>>>>>>>>>subscriptionCacheCfg.setWriteThrough(false);
>>>>>>>>>>subscriptionCacheCfg.setReadThrough(true);
>>>>>>>>>>subscriptionCacheCfg.setRebalanceMode(CacheRebalanceMode.ASYNC);
>>>>>>>>>>subscriptionCacheCfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);
>>>>>>>>>>subscriptionCacheCfg.setBackups(2);
>>>>>>>>>>
>>>>>>>>>>Factory<SubscriptionDataLoader> storeFactory =
>>>>>>>>>>    FactoryBuilder.factoryOf(SubscriptionDataLoader.class);
>>>>>>>>>>subscriptionCacheCfg.setCacheStoreFactory(storeFactory);
>>>>>>>>>>subscriptionCacheCfg.setIndexedTypes(DefaultDataKey.class, SubscriptionData.class);
>>>>>>>>>>subscriptionCacheCfg.setSqlIndexMaxInlineSize(47);
>>>>>>>>>>
>>>>>>>>>>RendezvousAffinityFunction affinityFunction = new RendezvousAffinityFunction();
>>>>>>>>>>affinityFunction.setExcludeNeighbors(true);
>>>>>>>>>>subscriptionCacheCfg.setAffinity(affinityFunction);
>>>>>>>>>>subscriptionCacheCfg.setStatisticsEnabled(true);
>>>>>>>>>>subscriptionCacheCfg.setPartitionLossPolicy(PartitionLossPolicy.READ_WRITE_SAFE);
>>>>>>>>>> 
>>>>>>>>>>Exception stack trace
>>>>>>>>>> 
>>>>>>>>>>ERROR c.q.dgms.kafka.TaskRequestListener - Error occurred while consuming the object
>>>>>>>>>>com.baidu.unbiz.fluentvalidator.exception.RuntimeValidateException: java.lang.IllegalStateException: class org.apache.ignite.internal.processors.cache.CacheStoppedException: Failed to perform cache operation (cache is stopped): SUBSCRIPTION_CACHE
>>>>>>>>>>at com.baidu.unbiz.fluentvalidator.FluentValidator.doValidate(FluentValidator.java:506)
>>>>>>>>>>at com.baidu.unbiz.fluentvalidator.FluentValidator.doValidate(FluentValidator.java:461)
>>>>>>>>>>at com.xyz.dgms.service.UserManagementServiceImpl.deleteUser(UserManagementServiceImpl.java:710)
>>>>>>>>>>at com.xyz.dgms.kafka.TaskRequestListener.processRequest(TaskRequestListener.java:190)
>>>>>>>>>>at com.xyz.dgms.kafka.TaskRequestListener.process(TaskRequestListener.java:89)
>>>>>>>>>>at com.xyz.libraries.mom.kafka.consumer.TopicConsumer.lambda$run$3(TopicConsumer.java:162)
>>>>>>>>>>at net.jodah.failsafe.Functions$12.call(Functions.java:274)
>>>>>>>>>>at net.jodah.failsafe.SyncFailsafe.call(SyncFailsafe.java:145)
>>>>>>>>>>at net.jodah.failsafe.SyncFailsafe.run(SyncFailsafe.java:93)
>>>>>>>>>>at com.xyz.libraries.mom.kafka.consumer.TopicConsumer.run(TopicConsumer.java:159)
>>>>>>>>>>at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>>>>>>>>at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>>>>>>>>at java.lang.Thread.run(Thread.java:748)
>>>>>>>>>>Caused by: java.lang.IllegalStateException: class org.apache.ignite.internal.processors.cache.CacheStoppedException: Failed to perform cache operation (cache is stopped): SUBSCRIPTION_CACHE
>>>>>>>>>>at org.apache.ignite.internal.processors.cache.GridCacheGateway.enter(GridCacheGateway.java:166)
>>>>>>>>>>at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.onEnter(GatewayProtectedCacheProxy.java:1625)
>>>>>>>>>>at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.get(GatewayProtectedCacheProxy.java:673)
>>>>>>>>>>at com.xyz.dgms.grid.dao.AbstractDataGridDAO.getData(AbstractDataGridDAO.java:39)
>>>>>>>>>>at com.xyz.dgms.grid.dao.AbstractDataGridDAO.getData(AbstractDataGridDAO.java:28)
>>>>>>>>>>at com.xyz.dgms.grid.dataservice.DefaultDataGridService.getData(DefaultDataGridService.java:22)
>>>>>>>>>>at com.xyz.dgms.grid.dataservice.DefaultDataGridService.getData(DefaultDataGridService.java:10)
>>>>>>>>>>at com.xyz.dgms.validators.common.validators.UserDataValidator.validateSubscription(UserDataValidator.java:226)
>>>>>>>>>>at com.xyz.dgms.validators.common.validators.UserDataValidator.validateRequest(UserDataValidator.java:124)
>>>>>>>>>>at com.xyz.dgms.validators.common.validators.UserDataValidator.validate(UserDataValidator.java:346)
>>>>>>>>>>at com.xyz.dgms.validators.common.validators.UserDataValidator.validate(UserDataValidator.java:41)
>>>>>>>>>>at com.baidu.unbiz.fluentvalidator.FluentValidator.doValidate(FluentValidator.java:490)
>>>>>>>>>>... 12 common frames omitted
>>>>>>>>>>Caused by: org.apache.ignite.internal.processors.cache.CacheStoppedException: Failed to perform cache operation (cache is stopped): SUBSCRIPTION_CACHE
>>>>>>>>>>... 24 common frames omitted
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>Thanks,
>>>>>>>>>>Akash