Re: rejecting communication connection & Failed to process selector key
Yes - Kubernetes discovery SPI. There are no client nodes outside the cluster. The communication SPI settings are below.

config of communicationSpi
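For reference, a minimal sketch of a programmatic TcpCommunicationSpi setup of the kind being discussed. The port matches the 47100 seen in the logs in this thread; the timeout values are illustrative assumptions, not the poster's actual settings:

import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi;

public class CommunicationSpiSketch {
    public static void main(String[] args) {
        TcpCommunicationSpi commSpi = new TcpCommunicationSpi();

        // Default communication port, matching the locAddr=...:47100 seen in the logs.
        commSpi.setLocalPort(47100);

        // Illustrative timeouts only; tune for your environment.
        commSpi.setConnectTimeout(10_000);         // time allowed to establish a connection, ms
        commSpi.setSocketWriteTimeout(10_000);     // time allowed for a blocked socket write, ms
        commSpi.setIdleConnectionTimeout(600_000); // close connections idle longer than this, ms

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setCommunicationSpi(commSpi);

        Ignition.start(cfg);
    }
}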
Re: rejecting communication connection & Failed to process selector key
Are you using the Kubernetes discovery SPI?

Humphrey

On 16 Sep 2024, at 02:58, MJ <6733...@qq.com> wrote:

Hi Igniters,

I am hitting the "Failed to process selector key" error once every one or two days. Each time it occurs, the node accepts and then rejects several incoming communication connections before throwing the exception. The log below shows "Broken pipe" as the original exception, but it is not always "Broken pipe": occasionally "Failed to process selector key" wraps "Connection reset" or "javax.net.ssl.SSLException: Failed to encrypt data (SSL engine error) [status=CLOSED, handshakeStatus=NOT_HANDSHAKING]". Is there a way to fix this, or can the configuration be improved?

Ignite 2.16.0 / 4 data nodes, running in OpenShift 4.

config of communicationSpi

24-09-15 17:18:35.146 [INFO ] grid-nio-worker-tcp-comm-2-#25%TcpCommunicationSpi% o.a.i.s.c.t.TcpCommunicationSpi:117 - Accepted incoming communication connection [locAddr=/10.254.32.162:47100, rmtAddr=/10.254.13.83:35160]
24-09-15 17:18:35.147 [INFO ] grid-nio-worker-tcp-comm-2-#25%TcpCommunicationSpi% o.a.i.s.c.t.TcpCommunicationSpi:117 - Received incoming connection when already connected to this node, rejecting [locNode=52437bc3-3dfe-4f76-bec6-d2f22f8a5d40, rmtNode=7c28b6bc-8991-47a2-b69c-6adba0482713]
24-09-15 17:18:35.357 [INFO ] grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi% o.a.i.s.c.t.TcpCommunicationSpi:117 - Accepted incoming communication connection [locAddr=/10.254.32.162:47100, rmtAddr=/10.254.13.83:35162]
24-09-15 17:18:35.358 [INFO ] grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi% o.a.i.s.c.t.TcpCommunicationSpi:117 - Received incoming connection when already connected to this node, rejecting [locNode=52437bc3-3dfe-4f76-bec6-d2f22f8a5d40, rmtNode=7c28b6bc-8991-47a2-b69c-6adba0482713]
24-09-15 17:18:35.568 [INFO ] grid-nio-worker-tcp-comm-0-#23%TcpCommunicationSpi% o.a.i.s.c.t.TcpCommunicationSpi:117 - Accepted incoming communication connection [locAddr=/10.254.32.162:47100, rmtAddr=/10.254.13.83:35164]
24-09-15 17:18:35.569 [INFO ] grid-nio-worker-tcp-comm-0-#23%TcpCommunicationSpi% o.a.i.s.c.t.TcpCommunicationSpi:117 - Received incoming connection when already connected to this node, rejecting [locNode=52437bc3-3dfe-4f76-bec6-d2f22f8a5d40, rmtNode=7c28b6bc-8991-47a2-b69c-6adba0482713]
24-09-15 17:18:35.975 [ERROR] grid-nio-worker-tcp-comm-1-#24%TcpCommunicationSpi% o.a.i.s.c.t.TcpCommunicationSpi:137 - Failed to process selector key [ses=GridSelectorNioSessionImpl [worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=1, bytesRcvd=29406013584, bytesSent=0, bytesRcvd0=0, bytesSent0=0, select=true, super=GridWorker [name=grid-nio-worker-tcp-comm-1, igniteInstanceName=TcpCommunicationSpi, finished=false, heartbeatTs=1726435114873, hashCode=1144648384, interrupted=false, runner=grid-nio-worker-tcp-comm-1-#24%TcpCommunicationSpi%]]], writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], inRecovery=GridNioRecoveryDescriptor [acked=20129536, resendCnt=0, rcvCnt=19533551, sentCnt=20129879, reserved=true, lastAck=19533551, nodeLeft=false, node=TcpDiscoveryNode [id=7c28b6bc-8991-47a2-b69c-6adba0482713, consistentId=10.254.13.83,127.0.0.1:47500, addrs=ArrayList [10.254.13.83, 127.0.0.1], sockAddrs=HashSet [/10.254.13.83:47500, /127.0.0.1:47500], discPort=47500, order=3, intOrder=3, lastExchangeTime=1724822271382, loc=false, ver=2.16.0#20231215-sha1:7bde6a42, isClient=false], connected=false, connectCnt=205, queueLimit=131072, reserveCnt=260,
pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=20129536, resendCnt=0, rcvCnt=19533551, sentCnt=20129879, reserved=true, lastAck=19533551, nodeLeft=false, node=TcpDiscoveryNode [id=7c28b6bc-8991-47a2-b69c-6adba0482713, consistentId=10.254.13.83,127.0.0.1:47500, addrs=ArrayList [10.254.13.83, 127.0.0.1], sockAddrs=HashSet [/10.254.13.83:47500, /127.0.0.1:47500], discPort=47500, order=3, intOrder=3, lastExchangeTime=1724822271382, loc=false, ver=2.16.0#20231215-sha1:7bde6a42, isClient=false], connected=false, connectCnt=205, queueLimit=131072, reserveCnt=260, pairedConnections=false], closeSocket=true, outboundMessagesQueueSizeMetric=o.a.i.i.processors.metric.impl.LongAdderMetric@69a257d1, super=GridNioSessionImpl [locAddr=/10.254.32.162:52542, rmtAddr=/10.254.13.83:47100, createTime=1726435114863, closeTime=0, bytesSent=164200, bytesRcvd=468, bytesSent0=0, bytesRcvd0=0, sndSchedTime=1726435114863, lastSndTime=1726435114972, lastRcvTime=1726435114972, readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter [parser=o.a.i.i.util.nio.GridDirectParser@5196c6f7, directMode=true], GridConnectionBytesVerifyFilter, SSL filter], accepted=false, markedForClose=true]]] java.io.IOException: Broken pipe
Re: How to use ClusterNodeAttributeAffinityBackupFilter to have at least one replica in second zone
Hello Amit,

You can use ClusterNodeAttributeAffinityBackupFilter and introduce some virtual zones. For example, if you have 5 nodes in zone 1 and 5 nodes in zone 2, you can assign the 'zone1' attribute value to 3 nodes from zone 1, the 'zone2' attribute value to 3 nodes from zone 2, and the 'zone3' attribute value to the 4 remaining nodes (2 from zone 1 and 2 from zone 2). In this case there will be 3 copies of each partition and each partition will have a copy in each virtual zone, but nodes in virtual 'zone3' will hold slightly fewer partitions than nodes in 'zone1' and 'zone2'.

Or you can create your own backup filter that allows no more than two nodes for the same attribute value, for example like this:

public class MyAffinityBackupFilter implements IgniteBiPredicate<ClusterNode, List<ClusterNode>> {
    private final String attrName;

    public MyAffinityBackupFilter(String attrName) {
        this.attrName = attrName;
    }

    @Override public boolean apply(ClusterNode candidate, List<ClusterNode> previouslySelected) {
        Set<Object> usedAttrs = new HashSet<>();

        for (ClusterNode node : previouslySelected) {
            if (Objects.equals(candidate.attribute(attrName), node.attribute(attrName)) &&
                !usedAttrs.add(candidate.attribute(attrName)))
                return false;
        }

        return true;
    }
}

In this case you can achieve a more even distribution.

On Thu, 19 Sep 2024 at 16:58, Amit Jolly wrote:
>
> Hi Pavel,
>
> Well, based upon the documentation of the ClusterNodeAttributeAffinityBackupFilter.java class, it says "This implementation will discard backups rather than place multiple on the same set of nodes. This avoids trying to cram more data onto remaining nodes when some have failed." I have verified the same by running a small test with a three-node cluster (one node assigned the attribute AVAILABILITY_ZONE=ZONE1 and the other two assigned AVAILABILITY_ZONE=ZONE2). I created a cache with 2 backups using ClusterNodeAttributeAffinityBackupFilter in RendezvousAffinityFunction as below, then added an entry into the cache and checked the node count for the primary and backups using the cache affinity function. It returned 2 instead of 3.
>
> ClusterNodeAttributeAffinityBackupFilter backupFilter = new ClusterNodeAttributeAffinityBackupFilter("AVAILABILITY_ZONE");
> RendezvousAffinityFunction rendezvousAffinityFunction = new RendezvousAffinityFunction();
> rendezvousAffinityFunction.setAffinityBackupFilter(backupFilter);
>
> CacheConfiguration<String, String> cacheConfiguration = new CacheConfiguration<>();
> cacheConfiguration.setBackups(2);
> cacheConfiguration.setAffinity(rendezvousAffinityFunction);
>
> IgniteCache<String, String> cache = ignite.getOrCreateCache(cacheConfiguration);
> cache.put("1", "1");
> Collection<ClusterNode> nodes = ((Ignite)cache.unwrap(Ignite.class)).affinity(cache.getName()).mapKeyToPrimaryAndBackups("1");
> assertEquals(3, nodes.size()); // This fails even though I have three nodes (1 with node attribute AVAILABILITY_ZONE="ZONE1" and the other two with node attribute AVAILABILITY_ZONE="ZONE2")
>
> PS: I started three nodes with custom configuration using IgniteConfiguration.setUserAttributes
>
> Server Node1
> =
> Map<String, String> userAttributes = new HashMap<>();
> userAttributes.put("AVAILABILITY_ZONE", "ZONE1");
> IgniteConfiguration cfg = new IgniteConfiguration();
> cfg.setUserAttributes(userAttributes);
> Ignition.start(cfg);
>
> Server Node2
> =
> Map<String, String> userAttributes = new HashMap<>();
> userAttributes.put("AVAILABILITY_ZONE", "ZONE2");
> IgniteConfiguration cfg = new IgniteConfiguration();
> cfg.setUserAttributes(userAttributes);
> Ignition.start(cfg);
>
> Server Node3
> =
> Map<String, String> userAttributes = new HashMap<>();
> userAttributes.put("AVAILABILITY_ZONE", "ZONE2");
> IgniteConfiguration cfg = new IgniteConfiguration();
> cfg.setUserAttributes(userAttributes);
> Ignition.start(cfg);
>
> https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/cache/affinity/rendezvous/ClusterNodeAttributeAffinityBackupFilter.html
>
> Thanks,
>
> Amit Jolly
>
> On Thu, Sep 19, 2024 at 12:51 AM Pavel Tupitsyn wrote:
>>
>> Hi Amit,
>>
>> > if the backup count is let's say 2, Ignite won't create a second backup as there are not enough zones
>> Not correct - Ignite will create backups anyway.
>> - A backup is a copy of a partition on another node
>> - With 2 backups every partition will have 3 copies (1 primary, 2 backup), all on different nodes (since you have 10 nodes)
>> - Use ClusterNodeAttributeAffinityBackupFilter to ensure that at least one of the copies is in a different AZ
>>
>> And that is enough for 3 copies.
>>
>> On Thu, Sep 19, 2024 at 12:10 AM Amit Jolly wrote:
>>>
>>> Hi Team
>>>
>>> We are planning to run 10 node Ignite clusters in AWS with 5 nodes each into two availability zones. Using Kubernetes topologyspreadconstraints we have made sure th
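For illustration, a hedged sketch of the virtual-zone idea from the reply above, assuming a user attribute named VIRTUAL_ZONE (the attribute name, values, and cache name are placeholders, not part of the original setup):

import java.util.HashMap;
import java.util.Map;

import org.apache.ignite.Ignition;
import org.apache.ignite.cache.affinity.rendezvous.ClusterNodeAttributeAffinityBackupFilter;
import org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class VirtualZoneNodeSketch {
    public static void main(String[] args) {
        // On each node, set one of the three virtual-zone values ('zone1', 'zone2' or 'zone3'),
        // e.g. 3 nodes from AZ1 -> zone1, 3 nodes from AZ2 -> zone2, the remaining 4 -> zone3.
        Map<String, String> attrs = new HashMap<>();
        attrs.put("VIRTUAL_ZONE", "zone1");

        // The filter keeps every copy of a partition in a distinct virtual zone,
        // so with 2 backups all 3 copies land in 3 different virtual zones.
        RendezvousAffinityFunction aff = new RendezvousAffinityFunction();
        aff.setAffinityBackupFilter(new ClusterNodeAttributeAffinityBackupFilter("VIRTUAL_ZONE"));

        CacheConfiguration<String, String> cacheCfg = new CacheConfiguration<>("myCache");
        cacheCfg.setBackups(2);
        cacheCfg.setAffinity(aff);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setUserAttributes(attrs);
        cfg.setCacheConfiguration(cacheCfg);

        Ignition.start(cfg);
    }
}

Because 'zone1' sits entirely in the first physical AZ and 'zone2' entirely in the second, every partition gets at least one copy in each AZ; the trade-off, as noted above, is that nodes in the larger 'zone3' hold slightly fewer partitions.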
Re: How to use ClusterNodeAttributeAffinityBackupFilter to have at least one replica in second zone
Hello Alex,

Thanks for the suggestion. I will try both options. Most likely I will use the custom AffinityBackupFilter implementation you provided and will try to enhance it so that the zone skew is configurable.

Regards,
Amit

On Thu, Sep 19, 2024 at 3:16 PM Alex Plehanov wrote:
> Hello Amit,
>
> You can use ClusterNodeAttributeAffinityBackupFilter and introduce some virtual zones. For example, if you have 5 nodes in zone 1 and 5 nodes in zone 2, you can assign the 'zone1' attribute value to 3 nodes from zone 1, the 'zone2' attribute value to 3 nodes from zone 2, and the 'zone3' attribute value to the 4 remaining nodes (2 from zone 1 and 2 from zone 2). In this case there will be 3 copies of each partition and each partition will have a copy in each virtual zone, but nodes in virtual 'zone3' will hold slightly fewer partitions than nodes in 'zone1' and 'zone2'.
>
> Or you can create your own backup filter that allows no more than two nodes for the same attribute value, for example like this:
>
> public class MyAffinityBackupFilter implements IgniteBiPredicate<ClusterNode, List<ClusterNode>> {
>     private final String attrName;
>
>     public MyAffinityBackupFilter(String attrName) {
>         this.attrName = attrName;
>     }
>
>     @Override public boolean apply(ClusterNode candidate, List<ClusterNode> previouslySelected) {
>         Set<Object> usedAttrs = new HashSet<>();
>
>         for (ClusterNode node : previouslySelected) {
>             if (Objects.equals(candidate.attribute(attrName), node.attribute(attrName)) &&
>                 !usedAttrs.add(candidate.attribute(attrName)))
>                 return false;
>         }
>
>         return true;
>     }
> }
>
> In this case you can achieve a more even distribution.
>
> On Thu, 19 Sep 2024 at 16:58, Amit Jolly wrote:
> >
> > Hi Pavel,
> >
> > Well, based upon the documentation of the ClusterNodeAttributeAffinityBackupFilter.java class, it says "This implementation will discard backups rather than place multiple on the same set of nodes. This avoids trying to cram more data onto remaining nodes when some have failed." I have verified the same by running a small test with a three-node cluster (one node assigned the attribute AVAILABILITY_ZONE=ZONE1 and the other two assigned AVAILABILITY_ZONE=ZONE2). I created a cache with 2 backups using ClusterNodeAttributeAffinityBackupFilter in RendezvousAffinityFunction as below, then added an entry into the cache and checked the node count for the primary and backups using the cache affinity function. It returned 2 instead of 3.
> >
> > ClusterNodeAttributeAffinityBackupFilter backupFilter = new ClusterNodeAttributeAffinityBackupFilter("AVAILABILITY_ZONE");
> > RendezvousAffinityFunction rendezvousAffinityFunction = new RendezvousAffinityFunction();
> > rendezvousAffinityFunction.setAffinityBackupFilter(backupFilter);
> >
> > CacheConfiguration<String, String> cacheConfiguration = new CacheConfiguration<>();
> > cacheConfiguration.setBackups(2);
> > cacheConfiguration.setAffinity(rendezvousAffinityFunction);
> >
> > IgniteCache<String, String> cache = ignite.getOrCreateCache(cacheConfiguration);
> > cache.put("1", "1");
> > Collection<ClusterNode> nodes = ((Ignite)cache.unwrap(Ignite.class)).affinity(cache.getName()).mapKeyToPrimaryAndBackups("1");
> > assertEquals(3, nodes.size()); // This fails even though I have three nodes (1 with node attribute AVAILABILITY_ZONE="ZONE1" and the other two with node attribute AVAILABILITY_ZONE="ZONE2")
> >
> > PS: I started three nodes with custom configuration using IgniteConfiguration.setUserAttributes
> >
> > Server Node1
> > =
> > Map<String, String> userAttributes = new HashMap<>();
> > userAttributes.put("AVAILABILITY_ZONE", "ZONE1");
> > IgniteConfiguration cfg = new IgniteConfiguration();
> > cfg.setUserAttributes(userAttributes);
> > Ignition.start(cfg);
> >
> > Server Node2
> > =
> > Map<String, String> userAttributes = new HashMap<>();
> > userAttributes.put("AVAILABILITY_ZONE", "ZONE2");
> > IgniteConfiguration cfg = new IgniteConfiguration();
> > cfg.setUserAttributes(userAttributes);
> > Ignition.start(cfg);
> >
> > Server Node3
> > =
> > Map<String, String> userAttributes = new HashMap<>();
> > userAttributes.put("AVAILABILITY_ZONE", "ZONE2");
> > IgniteConfiguration cfg = new IgniteConfiguration();
> > cfg.setUserAttributes(userAttributes);
> > Ignition.start(cfg);
> >
> > https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/cache/affinity/rendezvous/ClusterNodeAttributeAffinityBackupFilter.html
> >
> > Thanks,
> >
> > Amit Jolly
> >
> > On Thu, Sep 19, 2024 at 12:51 AM Pavel Tupitsyn wrote:
> >>
> >> Hi Amit,
> >>
> >> > if the backup count is let's say 2, Ignite won't create a second backup as there are not enough zones
> >> Not correct - Ignite will create backups anyway.
> >> - A backup is a copy of a partition on another node
> >> - With 2 backups every partition will have 3 copies (1 primary, 2 backup), all on differe
Re: How to use ClusterNodeAttributeAffinityBackupFilter to have at least one replica in second zone
Hi Pavel,

Well, based upon the documentation of the ClusterNodeAttributeAffinityBackupFilter.java class, it says "This implementation will discard backups rather than place multiple on the same set of nodes. This avoids trying to cram more data onto remaining nodes when some have failed." I have verified the same by running a small test with a three-node cluster (one node assigned the attribute AVAILABILITY_ZONE=ZONE1 and the other two assigned AVAILABILITY_ZONE=ZONE2). I created a cache with 2 backups using ClusterNodeAttributeAffinityBackupFilter in RendezvousAffinityFunction as below, then added an entry into the cache and checked the node count for the primary and backups using the cache affinity function. It returned 2 instead of 3.

ClusterNodeAttributeAffinityBackupFilter backupFilter = new ClusterNodeAttributeAffinityBackupFilter("AVAILABILITY_ZONE");
RendezvousAffinityFunction rendezvousAffinityFunction = new RendezvousAffinityFunction();
rendezvousAffinityFunction.setAffinityBackupFilter(backupFilter);

CacheConfiguration<String, String> cacheConfiguration = new CacheConfiguration<>();
cacheConfiguration.setBackups(2);
cacheConfiguration.setAffinity(rendezvousAffinityFunction);

IgniteCache<String, String> cache = ignite.getOrCreateCache(cacheConfiguration);
cache.put("1", "1");
Collection<ClusterNode> nodes = ((Ignite)cache.unwrap(Ignite.class)).affinity(cache.getName()).mapKeyToPrimaryAndBackups("1");
assertEquals(3, nodes.size()); // This fails even though I have three nodes (1 with node attribute AVAILABILITY_ZONE="ZONE1" and the other two with node attribute AVAILABILITY_ZONE="ZONE2")

PS: I started three nodes with custom configuration using IgniteConfiguration.setUserAttributes

Server Node1
=
Map<String, String> userAttributes = new HashMap<>();
userAttributes.put("AVAILABILITY_ZONE", "ZONE1");
IgniteConfiguration cfg = new IgniteConfiguration();
cfg.setUserAttributes(userAttributes);
Ignition.start(cfg);

Server Node2
=
Map<String, String> userAttributes = new HashMap<>();
userAttributes.put("AVAILABILITY_ZONE", "ZONE2");
IgniteConfiguration cfg = new IgniteConfiguration();
cfg.setUserAttributes(userAttributes);
Ignition.start(cfg);

Server Node3
=
Map<String, String> userAttributes = new HashMap<>();
userAttributes.put("AVAILABILITY_ZONE", "ZONE2");
IgniteConfiguration cfg = new IgniteConfiguration();
cfg.setUserAttributes(userAttributes);
Ignition.start(cfg);

https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/cache/affinity/rendezvous/ClusterNodeAttributeAffinityBackupFilter.html

Thanks,

Amit Jolly

On Thu, Sep 19, 2024 at 12:51 AM Pavel Tupitsyn wrote:
> Hi Amit,
>
> > if the backup count is let's say 2, Ignite won't create a second backup as there are not enough zones
> Not correct - Ignite will create backups anyway.
> - A backup is a copy of a partition on another node
> - With 2 backups every partition will have 3 copies (1 primary, 2 backup), all on different nodes (since you have 10 nodes)
> - Use ClusterNodeAttributeAffinityBackupFilter to ensure that at least one of the copies is in a different AZ
>
> And that is enough for 3 copies.
>
> On Thu, Sep 19, 2024 at 12:10 AM Amit Jolly wrote:
>
>> Hi Team,
>>
>> We are planning to run 10-node Ignite clusters in AWS with 5 nodes in each of two availability zones. Using Kubernetes topologySpreadConstraints we have made sure that no two Ignite pods are started on the same virtual machine/node/host.
>>
>> I understand that with ClusterNodeAttributeAffinityBackupFilter I can force Ignite to store the backup in a different zone if the backup count is 1.
>>
>> But if the backup count is, let's say, 2, Ignite won't create a second backup as there are not enough zones.
>>
>> My question is: if I have a backup count of 2, how can I use ClusterNodeAttributeAffinityBackupFilter (or a custom AffinityBackupFilter) to have at least one backup in each zone and another backup anywhere else where available?
>>
>> I think that in order to achieve this I would somehow need currentTopologySnapshot available in ClusterNodeAttributeAffinityBackupFilter or a custom AffinityBackupFilter.
>>
>> Thanks,
>>
>> Amit Jolly
>>
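For completeness, a hedged sketch of the kind of placement check used in the test above: map a key to its primary and backup nodes and count the distinct AVAILABILITY_ZONE values they cover. The cache name and the expected spread are assumptions about the test topology, not part of the original messages:

import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.cluster.ClusterNode;

public class ZoneSpreadCheck {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            // Primary + backups for a sample key, as in the test above.
            Collection<ClusterNode> owners =
                ignite.affinity("myCache").mapKeyToPrimaryAndBackups("1");

            // Count the distinct zones covered by the owning nodes.
            Set<Object> zones = new HashSet<>();
            for (ClusterNode node : owners)
                zones.add(node.attribute("AVAILABILITY_ZONE"));

            System.out.println("Copies: " + owners.size() + ", distinct zones: " + zones.size());

            // With 2 backups and at most 2 copies per zone, the expectation would be
            // 3 copies spread over at least 2 zones (assuming enough nodes per zone).
        }
    }
}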