Hi guys! I checked the PR and I think we should improve it a little. With this change, `channelsInit` starts in every user thread whenever the topology has changed. This method invokes `initChannelHolders`, which is synchronized, so concurrent user threads end up waiting on each other.
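To make the concern concrete, here is a rough sketch of a guard that would keep user threads from piling up on the synchronized initialization. It is not the actual `ReliableChannel` code: the class `ReinitGuard`, its methods, and the use of plain strings for addresses are all hypothetical, and the flag only plays the role that `scheduledChannelsReinit` has in the client. The idea is that a user thread re-initializes channels only if none of the freshly discovered addresses is already known and no other thread has started the same work.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical guard for illustration; not part of Ignite's ReliableChannel. */
final class ReinitGuard {
    /** True while a re-initialization is pending or running (assumed analogue of scheduledChannelsReinit). */
    private final AtomicBoolean reinitScheduled = new AtomicBoolean();

    /**
     * Returns true for at most one caller at a time, and only when the
     * discovered addresses share nothing with the known ones.
     */
    boolean tryStartReinit(Collection<String> knownAddrs, Collection<String> discoveredAddrs) {
        // Nothing discovered: there is nothing to re-initialize with.
        if (discoveredAddrs.isEmpty())
            return false;

        // Any overlap means a known node may still be reachable, so keep the usual failover path.
        if (!Collections.disjoint(knownAddrs, discoveredAddrs))
            return false;

        // Only the thread that flips the flag from false to true proceeds.
        return reinitScheduled.compareAndSet(false, true);
    }

    /** Must be called when the re-initialization finishes, whether it succeeded or not. */
    void finishReinit() {
        reinitScheduled.set(false);
    }
}
```

A caller would wrap the re-initialization in `if (guard.tryStartReinit(known, discovered)) { try { /* re-init */ } finally { guard.finishReinit(); } }` and fall back to the default channel otherwise.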
Before this change:
1. On channel failure, the user thread tries to start the operation on the default channel.

After this change:
1. The user thread just hangs and then does useless work (the channels will be prepared in another thread anyway, via `#onTopologyChanged`).
2. After this work it still follows the old logic and uses the default channel, even after the channels have been updated successfully.

My proposal: add more restrictions on starting `channelsInit` in `onChannelFailure`. I think we should:
1. Check that the addresses have changed completely and have no intersection with the known addresses.
2. Check that this process hasn't already been started concurrently; see `scheduledChannelsReinit`.

WDYT?

On Tue, Jun 28, 2022 at 3:04 PM Pavel Tupitsyn <ptupit...@apache.org> wrote:

> The fix has been merged.
> Thanks for the contribution, wkhapy123!
>
> On Fri, Jun 24, 2022 at 8:57 AM Pavel Tupitsyn <ptupit...@apache.org>
> wrote:
>
>> Please request contributor permissions on d...@ignite.apache.org
>>
>> On Fri, Jun 24, 2022 at 6:53 AM wkhapy...@gmail.com <wkhapy...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I want to contribute to Apache ignite .
>>>
>>> Would you please give me the contributor permission?
>>> below is my account
>>> ------------------------------
>>> wkhapy...@gmail.com
>>>
>>>
>>> *From:* wkhapy...@gmail.com
>>> *Date:* 2022-06-23 22:34
>>> *To:* Pavel Tupitsyn <ptupit...@apache.org>
>>> *Subject:* Re: ignite client can not reconnect to ignite Kubernetes
>>> cluster,after pod restart
>>> thank you!i will follow wiki.
>>> ---Original---
>>> *From:* "Pavel Tupitsyn"<ptupit...@apache.org>
>>> *Date:* Thu, Jun 23, 2022 22:25 PM
>>> *To:* "wkhapy123"<wkhapy...@gmail.com>;"user"<user@ignite.apache.org>;
>>> *Subject:* Re: ignite client can not reconnect to ignite Kubernetes
>>> cluster,after pod restart
>>>
>>> > I want to fix this bug . I think it is good opportunity to study
>>> ignite.
>>>
>>> Great! Please go ahead.
>>> Make sure to check our wiki, register as a contributor, and assign the
>>> ticket to yourself.
>>>
>>>
>>> https://cwiki.apache.org/confluence/plugins/servlet/mobile?contentId=177047163#content/view/177047163
>>>
>>>
>>> On Thu, Jun 23, 2022, 17:02 wkhapy...@gmail.com <wkhapy...@gmail.com>
>>> wrote:
>>>
>>>>
>>>> I want to fix this bug . I think it is good opportunity to study ignite.
>>>> ---Original---
>>>> *From:* "wkhapy...@gmail.com"<wkhapy...@gmail.com>
>>>> *Date:* Thu, Jun 23, 2022 21:41 PM
>>>> *To:* "Pavel Tupitsyn"<ptupit...@apache.org>;
>>>> *Subject:* Re: ignite client can not reconnect to ignite Kubernetes
>>>> cluster,after pod restart
>>>>
>>>> I am interested in ignite
>>>>
>>>> ---Original---
>>>> *From:* "wkhapy...@gmail.com"<wkhapy...@gmail.com>
>>>> *Date:* Thu, Jun 23, 2022 21:37 PM
>>>> *To:* "Pavel Tupitsyn"<ptupit...@apache.org>;
>>>> *Subject:* Re: ignite client can not reconnect to ignite Kubernetes
>>>> cluster,after pod restart
>>>>
>>>> can I repair it
>>>>
>>>> ---Original---
>>>> *From:* "Pavel Tupitsyn"<ptupit...@apache.org>
>>>> *Date:* Thu, Jun 23, 2022 20:17 PM
>>>> *To:* "user"<user@ignite.apache.org>;
>>>> *Subject:* Re: Re: ignite client can not reconnect to ignite
>>>> Kubernetes cluster,after pod restart
>>>>
>>>> It is a bug - addresses are not reloaded from AddressFinder on
>>>> connection loss, so we still try old pod address and fail:
>>>> https://issues.apache.org/jira/browse/IGNITE-17217
>>>>
>>>> Thanks for reporting this.
>>>>
>>>> On Thu, Jun 23, 2022 at 3:00 PM wkhapy...@gmail.com <
>>>> wkhapy...@gmail.com> wrote:
>>>>
>>>>> as you can see,address is 104
>>>>> but addressFinder.getAddress new ip is 87,and retrylimit is 5 (i set)
>>>>> ------------------------------
>>>>> wkhapy...@gmail.com
>>>>>
>>>>>
>>>>> *From:* wkhapy...@gmail.com
>>>>> *Date:* 2022-06-23 16:02
>>>>> *To:* Maksim Timonin <timoninma...@apache.org>
>>>>> *Subject:* Re: Re: ignite client can not reconnect to ignite
>>>>> Kubernetes cluster,after pod restart
>>>>> sorry i did not add,i will add and retry.
>>>>>
>>>>> ------------------------------
>>>>> wkhapy...@gmail.com
>>>>>
>>>>>
>>>>> *From:* Maksim Timonin <timoninma...@apache.org>
>>>>> *Date:* 2022-06-23 15:55
>>>>> *To:* user <user@ignite.apache.org>; wkhapy123 <wkhapy...@gmail.com>
>>>>> *Subject:* Re: Re: ignite client can not reconnect to ignite
>>>>> Kubernetes cluster,after pod restart
>>>>> Did you set any value to `ClientConfiguration#setRetryLimit`? If you
>>>>> check it with a single pod then any value greater than 1 should help (2 or
>>>>> 3).
>>>>>
>>>>> Could you please confirm that you have the failure even with this
>>>>> setting?
>>>>>
>>>>> On Thu, Jun 23, 2022 at 10:49 AM wkhapy...@gmail.com <
>>>>> wkhapy...@gmail.com> wrote:
>>>>>
>>>>>> Hi:
>>>>>> i find it still get Connection timed out exception.
>>>>>> and i add cfg.setPartitionAwarenessEnabled(true).
>>>>>> and its errorMsg
>>>>>> org.apache.ignite.client.ClientConnectionException: Connection timed out
>>>>>> at org.apache.ignite.internal.client.thin.io.gridnioserver.GridNioClientConnectionMultiplexer.open(GridNioClientConnectionMultiplexer.java:144)
>>>>>> at org.apache.ignite.internal.client.thin.TcpClientChannel.<init>(TcpClientChannel.java:178)
>>>>>> at org.apache.ignite.internal.client.thin.ReliableChannel$ClientChannelHolder.getOrCreateChannel(ReliableChannel.java:917)
>>>>>> at org.apache.ignite.internal.client.thin.ReliableChannel$ClientChannelHolder.getOrCreateChannel(ReliableChannel.java:898)
>>>>>> at org.apache.ignite.internal.client.thin.ReliableChannel$ClientChannelHolder.access$200(ReliableChannel.java:847)
>>>>>> at org.apache.ignite.internal.client.thin.ReliableChannel.applyOnDefaultChannel(ReliableChannel.java:759)
>>>>>> at org.apache.ignite.internal.client.thin.ReliableChannel.applyOnDefaultChannel(ReliableChannel.java:731)
>>>>>> at org.apache.ignite.internal.client.thin.ReliableChannel.service(ReliableChannel.java:167)
>>>>>> at org.apache.ignite.internal.client.thin.ReliableChannel.request(ReliableChannel.java:288)
>>>>>> at org.apache.ignite.internal.client.thin.TcpIgniteClient.getOrCreateCache(TcpIgniteClient.java:185)
>>>>>> at io.naza.vest.dao.impl.IgniteDAOImpl.getCache(IgniteDAOImpl.java:204)
>>>>>>
>>>>>> and i remote debug client in k8s
>>>>>>
>>>>>> class GridNioClientConnectionMultiplexer
>>>>>>
>>>>>> address is 81
>>>>>> but after ignite restart address is 104.
>>>>>> so i think address not refresh automatic.
>>>>>> and address only get in
>>>>>> ReliableChannel.class
>>>>>> initChannelHolders method
>>>>>> and address refresh in
>>>>>> i think
>>>>>> this place also need refresh
>>>>>>
>>>>>>
>>>>>> ------------------------------
>>>>>> wkhapy...@gmail.com
>>>>>>
>>>>>>
>>>>>> *From:* Maksim Timonin <timoninma...@apache.org>
>>>>>> *Date:* 2022-06-23 13:53
>>>>>> *To:* user <user@ignite.apache.org>
>>>>>> *Subject:* Re: ignite client can not reconnect to ignite Kubernetes
>>>>>> cluster,after pod restart
>>>>>> Hi,
>>>>>>
>>>>>> Please, try to use `ClientConfiguration#setRetryLimit` additionally
>>>>>> to `ClientRetryAllPolicy`. It should help you. Please let me know if it
>>>>>> solves the issue or not.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>>
>>>>>> On Wed, Jun 22, 2022 at 8:02 AM Ilya Korol <llivezk...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Please take look to
>>>>>>> https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/client/ClientAddressFinder.html,
>>>>>>>
>>>>>>> according to this ThinClientKubernetesAddressFinder should refresh
>>>>>>> address list on client connection failure, or you can try to set
>>>>>>> *partitionAwareness = true* in *ClientConfiguration*, that should
>>>>>>> force ip finder to proactively refresh address list.
>>>>>>>
>>>>>>> On 2022/06/22 01:53:38 f cad wrote:
>>>>>>> > below if client code config
>>>>>>> > KubernetesConnectionConfiguration kcfg = new KubernetesConnectionConfiguration();
>>>>>>> >
>>>>>>> > kcfg.setNamespace(igniteK8sNameSpace);
>>>>>>> > kcfg.setServiceName(igniteK8sServiceName);
>>>>>>> > cfg.setAddressesFinder(new ThinClientKubernetesAddressFinder(kcfg));
>>>>>>> > cfg.setRetryPolicy(new ClientRetryAllPolicy());
>>>>>>> >
>>>>>>> >
>>>>>>> > after ignite pod restart
>>>>>>> >
>>>>>>> > client throw Exception
>>>>>>> > org.apache.ignite.client.ClientConnectionException: Connection timed out
>>>>>>> > at org.apache.ignite.internal.client.thin.io.gridnioserver.GridNioClientConnectionMultiplexer.open(GridNioClientConnectionMultiplexer.java:144)
>>>>>>> > at org.apache.ignite.internal.client.thin.TcpClientChannel.<init>(TcpClientChannel.java:178)
>>>>>>> > at org.apache.ignite.internal.client.thin.ReliableChannel$ClientChannelHolder.getOrCreateChannel(ReliableChannel.java:917)
>>>>>>> > at org.apache.ignite.internal.client.thin.ReliableChannel$ClientChannelHolder.getOrCreateChannel(ReliableChannel.java:898)
>>>>>>> > at org.apache.ignite.internal.client.thin.ReliableChannel$ClientChannelHolder.access$200(ReliableChannel.java:847)
>>>>>>> > at org.apache.ignite.internal.client.thin.ReliableChannel.applyOnDefaultChannel(ReliableChannel.java:759)
>>>>>>> > at org.apache.ignite.internal.client.thin.ReliableChannel.applyOnDefaultChannel(ReliableChannel.java:731)
>>>>>>> > at org.apache.ignite.internal.client.thin.ReliableChannel.service(ReliableChannel.java:167)
>>>>>>> > at org.apache.ignite.internal.client.thin.ReliableChannel.request(ReliableChannel.java:288)
>>>>>>> > at org.apache.ignite.internal.client.thin.TcpIgniteClient.getOrCreateCache(TcpIgniteClient.java:185)
>>>>>>> >
>>>>>>> > and i use retry to reconnect and print
>>>>>>> > clientConfiguration.getAddressesFinder().getAddresses() and it address is
>>>>>>> > pod address,but client not reconnect
>>>>>>> >
>>>>>>> > while (retryTimeTmp < retryTimes) {
>>>>>>> >     try {
>>>>>>> >         return igniteClient.getOrCreateCache(new ClientCacheConfiguration()
>>>>>>> >             .setName(cacheName)
>>>>>>> >             .setAtomicityMode(TRANSACTIONAL)
>>>>>>> >             .setCacheMode(PARTITIONED)
>>>>>>> >             .setBackups(2)
>>>>>>> >             .setWriteSynchronizationMode(PRIMARY_SYNC));
>>>>>>> >     } catch (Exception e) {
>>>>>>> >         LOGGER.error("get cache [{}] not success", cacheName, e);
>>>>>>> >         LOGGER.error("get address info [{}], ipfinder [{}]",
>>>>>>> >             clientConfiguration.getAddresses(),
>>>>>>> >             clientConfiguration.getAddressesFinder().getAddresses());
>>>>>>> >
>>>>>>> >         retrySleep();
>>>>>>> >     } finally {
>>>>>>> >         retryTimeTmp++;
>>>>>>> >     }
>>>>>>> >
>>>>>>>
>>>>>>