It’s not that Visor connects to a thick client; it is a thick client. There are 
some odd implementation details (it’s written in Scala and runs in “daemon 
mode”), but it becomes part of the cluster, so the same “rules” apply to it as 
to any other thick client. Connections to other nodes are opened on demand, so 
the pause is most likely Visor trying to open a CommunicationSpi connection to 
one of the other nodes.
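
If the indefinite wait is the main annoyance, one thing you could try (untested 
here, and assuming you can point Visor at your own configuration) is tightening 
the communication SPI timeouts so a failed connection attempt gives up sooner. 
A rough Java sketch of the relevant properties; the same settings can also be 
expressed in the Spring XML config:

import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi;

public class VisorNodeConfig {
    public static IgniteConfiguration create() {
        TcpCommunicationSpi commSpi = new TcpCommunicationSpi();

        // Fail faster when a remote node is unreachable instead of backing off for a long time.
        commSpi.setConnectTimeout(3_000);      // first connect attempt, in milliseconds
        commSpi.setMaxConnectTimeout(10_000);  // upper bound after exponential backoff
        commSpi.setReconnectCount(2);          // connection attempts before giving up

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setCommunicationSpi(commSpi);
        return cfg;
    }
}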

I agree that this is not necessarily intuitive for an administrative tool, but 
it’s what we have until all of the functionality can be provided by a thin 
client or the REST API.
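
For what it's worth, basic cache operations already work over the thin client, 
which only needs a connection from the client to the server's 10800 port and 
never a connection back. A minimal Java sketch (host and cache names are made 
up):

import org.apache.ignite.Ignition;
import org.apache.ignite.client.ClientCache;
import org.apache.ignite.client.IgniteClient;
import org.apache.ignite.configuration.ClientConfiguration;

public class ThinClientSketch {
    public static void main(String[] args) throws Exception {
        // 10800 is the default thin-client port; only client -> server connectivity is required.
        ClientConfiguration cfg = new ClientConfiguration().setAddresses("server-host:10800");

        try (IgniteClient client = Ignition.startClient(cfg)) {
            ClientCache<Integer, String> cache = client.getOrCreateCache("example-cache");
            cache.put(1, "value");
            System.out.println(cache.get(1));
        }
    }
}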

> On 1 Jul 2020, at 16:38, John Smith <java.dev....@gmail.com> wrote:
> 
> So this is what I gathered from this experience.
> 
> When running commands in the Visor console, Visor will attempt to connect to 
> the thick clients.
> 
> For example, if you type the "node" command and ask for detailed statistics 
> for a specific thick client, Visor will pause on the data region stats until 
> it can connect.
> 
> Furthermore, if you have multiple thick clients that Visor has not connected 
> to yet and you call a more global command like "cache", that command will 
> also pause until a connection has been made to all of the thick clients.
> 
> 1- Whether this is good behaviour or not is up for debate, especially the 
> case where a thick client is listed in the topology/nodes but cannot be 
> reached and Visor hangs indefinitely.
> 2- I'm not sure whether this behaviour affects the server nodes in any way, 
> if they ever attempt to open a connection to a thick client and the protocol 
> somehow freezes just like in #1 above.
> 
> On Tue, 30 Jun 2020 at 09:54, John Smith <java.dev....@gmail.com 
> <mailto:java.dev....@gmail.com>> wrote:
> Ok so, is this expected behaviour? From a user's perspective this seems like 
> a bug.
> 
> Visor is supposed to be used as a way to monitor...
> 
> So if, as users, we enter a command and it just freezes indefinitely, it 
> seems unfriendly.
> 
> In another thread the team mentioned that they are working on something that 
> does not require the protocol to communicate back to a thick client, so I'm 
> wondering if this is related as well...
> 
> On Tue., Jun. 30, 2020, 6:58 a.m. Ilya Kasnacheev, <ilya.kasnach...@gmail.com 
> <mailto:ilya.kasnach...@gmail.com>> wrote:
> Hello!
> 
> I can see the following in the thread dump:
> "main" #1 prio=5 os_prio=0 tid=0x00007f02c400d800 nid=0x1e43 runnable 
> [0x00007f02cad1e000]
>    java.lang.Thread.State: RUNNABLE
> at sun.nio.ch.Net.poll(Native Method)
> at sun.nio.ch.SocketChannelImpl.poll(SocketChannelImpl.java:951)
> - locked <0x00000000ec066048> (a java.lang.Object)
> at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:121)
> - locked <0x00000000ec066038> (a java.lang.Object)
> at 
> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3299)
> at 
> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:2987)
> at 
> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2870)
> at 
> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2713)
> at 
> org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2672)
> at 
> org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1656)
> at 
> org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:1731)
> at 
> org.apache.ignite.internal.processors.task.GridTaskWorker.sendRequest(GridTaskWorker.java:1436)
> at 
> org.apache.ignite.internal.processors.task.GridTaskWorker.processMappedJobs(GridTaskWorker.java:666)
> at 
> org.apache.ignite.internal.processors.task.GridTaskWorker.body(GridTaskWorker.java:538)
> at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
> at 
> org.apache.ignite.internal.processors.task.GridTaskProcessor.startTask(GridTaskProcessor.java:764)
> at 
> org.apache.ignite.internal.processors.task.GridTaskProcessor.execute(GridTaskProcessor.java:392)
> at 
> org.apache.ignite.internal.IgniteComputeImpl.executeAsync0(IgniteComputeImpl.java:528)
> at 
> org.apache.ignite.internal.IgniteComputeImpl.execute(IgniteComputeImpl.java:498)
> at org.apache.ignite.visor.visor$.execute(visor.scala:1800)
> 
> It seems that Visor is trying to connect to the client node via the 
> communication SPI, and it fails because the network connection is filtered 
> out.
> 
> Regards,
> -- 
> Ilya Kasnacheev
> 
> 
> On Mon, 29 Jun 2020 at 23:47, John Smith <java.dev....@gmail.com 
> <mailto:java.dev....@gmail.com>>:
> Ok.
> 
> I am able to reproduce the "issue", assuming we don't have a misunderstanding 
> and are in fact talking about the same thing...
> 
> My thick client runs inside a container in a closed network NOT bridged and 
> NOT host. I added a flag to my application that allows it to add the address 
> resolver to the config.
> 
> 1- If I disable address resolution, connect with Visor to the cluster, and 
> try to print detailed statistics for that particular client, Visor freezes 
> indefinitely at the data region snapshot. Ctrl-C doesn't kill Visor either; 
> it's just stuck. The same happens when running the cache command: it just 
> freezes indefinitely.
> 
> I attached the jstack output to the email but it is also here: 
> https://www.dropbox.com/s/wujcee1gd87gk6o/jstack.out?dl=0 
> <https://www.dropbox.com/s/wujcee1gd87gk6o/jstack.out?dl=0>
> 
> 2- If I enable address resolution for the thick client, then all the commands 
> work OK. I also see an "Accepted incoming communication connection" log line 
> in the client.
> 
> On Mon, 29 Jun 2020 at 15:30, Ilya Kasnacheev <ilya.kasnach...@gmail.com 
> <mailto:ilya.kasnach...@gmail.com>> wrote:
> Hello!
> 
> The easiest way is jstack <process id of visor>
> 
> Regards,
> -- 
> Ilya Kasnacheev
> 
> 
> On Mon, 29 Jun 2020 at 20:20, John Smith <java.dev....@gmail.com 
> <mailto:java.dev....@gmail.com>>:
> How?
> 
> On Mon, 29 Jun 2020 at 12:03, Ilya Kasnacheev <ilya.kasnach...@gmail.com 
> <mailto:ilya.kasnach...@gmail.com>> wrote:
> Hello!
> 
> Try collecting thread dump from Visor as it freezes.
> 
> Regards,
> -- 
> Ilya Kasnacheev
> 
> 
> On Mon, 29 Jun 2020 at 18:11, John Smith <java.dev....@gmail.com 
> <mailto:java.dev....@gmail.com>>:
> How though?
> 
> 1- Entered node command
> 2- Got list of nodes, including thick clients
> 3- Selected thick client
> 4- Entered Y for detailed statistics
> 5- Snapshot details displayed
> 6- Data region stats frozen
> 
> I think the address resolution is working for this as well; I need to 
> confirm, because I fixed the resolver as per your solution and Visor no 
> longer freezes on #6 above.
> 
> On Mon, 29 Jun 2020 at 10:54, Ilya Kasnacheev <ilya.kasnach...@gmail.com 
> <mailto:ilya.kasnach...@gmail.com>> wrote:
> Hello!
> 
> This usually means there's no connectivity between node and Visor.
> 
> Regards,
> -- 
> Ilya Kasnacheev
> 
> 
> On Mon, 29 Jun 2020 at 17:01, John Smith <java.dev....@gmail.com 
> <mailto:java.dev....@gmail.com>>:
> Also, I think this applies to Visor as well?
> 
> When I do the top or node commands, I can see the thick client. But when I 
> look at detailed statistics for that particular thick client, it freezes 
> "indefinitely". Regular statistics seem OK.
> 
> On Mon, 29 Jun 2020 at 08:08, Ilya Kasnacheev <ilya.kasnach...@gmail.com 
> <mailto:ilya.kasnach...@gmail.com>> wrote:
> Hello!
> 
> For thick clients, you need both 47100 and 47500, both directions (perhaps 
> for 47500 only client -> server is sufficient, but for 47100, both are 
> needed).
> 
> For thin clients, 10800 is enough. For control.sh, 11211.
> 
> Regards,
> -- 
> Ilya Kasnacheev
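
As an aside on the ports listed above: a Java sketch (hypothetical host name, 
and Java instead of the Spring XML used elsewhere in this thread) of pinning 
those defaults explicitly in the client configuration, which makes it 
unambiguous what needs to be opened:

import java.util.Arrays;

import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi;
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;

public class ClientPortsSketch {
    public static IgniteConfiguration create() {
        // Discovery (cluster membership): the client connects out to the servers on 47500.
        TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
        ipFinder.setAddresses(Arrays.asList("server-host:47500"));

        TcpDiscoverySpi discoSpi = new TcpDiscoverySpi();
        discoSpi.setIpFinder(ipFinder);

        // Communication (data, compute, Visor requests): 47100 is used in both directions.
        TcpCommunicationSpi commSpi = new TcpCommunicationSpi();
        commSpi.setLocalPort(47100);
        commSpi.setLocalPortRange(0); // bind to exactly 47100 instead of scanning 47100..47199

        return new IgniteConfiguration()
            .setClientMode(true)
            .setDiscoverySpi(discoSpi)
            .setCommunicationSpi(commSpi);
    }
}

Setting the local port range to 0 keeps the node from falling back to a higher 
port, which makes a fixed container port mapping predictable.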
> 
> 
> On Fri, 26 Jun 2020 at 22:06, John Smith <java.dev....@gmail.com 
> <mailto:java.dev....@gmail.com>>:
> I'm asking this in a separate question so people can search for it if they 
> ever come across it...
> 
> My server nodes are started as follows, and I connect the client the same way.
> 
>                   <bean 
> class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
>                       <property name="addresses">
>                           <list>
>                             <value>foo:47500</value>
> ...
>                           </list>
>                       </property>
>                   </bean>
> 
> In my client code I used the basic address resolver and put the following 
> mapping in it:
> 
> "{internalHostIP}:47500", "{externalHostIp}:{externalPort}"
> 
> igniteConfig.setAddressResolver(addrResolver);
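
(For reference only: a minimal Java sketch of that wiring, using purely 
hypothetical internal and external addresses in place of the placeholders 
above.)

import java.net.UnknownHostException;
import java.util.HashMap;
import java.util.Map;

import org.apache.ignite.configuration.BasicAddressResolver;
import org.apache.ignite.configuration.IgniteConfiguration;

public class AddressResolverSketch {
    public static IgniteConfiguration withResolver(IgniteConfiguration igniteConfig)
        throws UnknownHostException {
        // Key: the address the node binds to inside the container;
        // value: the externally reachable address other nodes (and Visor) should use instead.
        Map<String, String> map = new HashMap<>();
        map.put("10.0.0.5:47500", "198.51.100.7:2377"); // hypothetical internal -> external mapping

        igniteConfig.setAddressResolver(new BasicAddressResolver(map));
        return igniteConfig;
    }
}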
> 
> QUESTIONS
> ___________________
> 
> 1- Is port 47500 used for discovery only?
> 2- Is port 47100 used for the actual communication with the nodes?
> 3- In my container environment I have only mapped 47100; do I also need to 
> map 47500 for the TCP discovery SPI?
> 4- When I connect with Visor and try to look at details for the client node, 
> it blocks. I'm assuming that's because Visor cannot connect back to the 
> client at 47100?
> See the logs below.
> 
> LOGS
> ___________________
> 
> When I look at the client logs I get...
> 
> IgniteConfiguration [
> igniteInstanceName=xxxxxx, 
> ...
> discoSpi=TcpDiscoverySpi [
>   addrRslvr=null, <--- Do I need to use BasicResolver here as well???
> ...
>   commSpi=TcpCommunicationSpi [
> ...
>     locAddr=null, 
>     locHost=null, 
>     locPort=47100, 
>     addrRslvr=null, <--- Do I need to use BasicResolver here as well???
> ...
>     ], 
> ...
>     addrRslvr=BasicAddressResolver [
>       inetAddrMap={}, 
>       inetSockAddrMap={/internalIp:47100=/externalIp:2389} <---- 
>     ], 
> ...
>     clientMode=true, 
> ...
> 
> 

