I had a look at the Cluster Receiver object reference, and I'm pretty sure it 
must be the local address on which to listen for incoming data. Since the 
multicast route is set on the eth1 interface, I use that interface's IP 
address (10.x). 

From the documentation:

address: The address (network interface) to listen for incoming traffic. Same 
as the bind address. The default value is auto and translates to 
java.net.InetAddress.getLocalHost().getHostAddress(). 
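
To see what "auto" would pick on a given box compared with the eth1 address, something like the following can be run on each node. It is only a sketch of the rule quoted above, not Tomcat's own code; the class and method names are made up for illustration, and "eth1" is simply the interface mentioned in this thread.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Tomcat.
class ReceiverAddressCheck {

    // Mirrors the documented rule only: "auto" falls back to the local
    // host address, any other value is used as given.
    static String resolveReceiverAddress(String configured) throws UnknownHostException {
        if ("auto".equals(configured)) {
            return InetAddress.getLocalHost().getHostAddress();
        }
        return configured;
    }

    // Lists the addresses bound to one interface; empty if it does not exist.
    static List<String> addressesOf(String interfaceName) throws SocketException {
        List<String> result = new ArrayList<>();
        NetworkInterface nif = NetworkInterface.getByName(interfaceName);
        if (nif == null) {
            return result;
        }
        for (InetAddress address : Collections.list(nif.getInetAddresses())) {
            result.add(address.getHostAddress());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("auto resolves to: " + resolveReceiverAddress("auto"));
        System.out.println("eth1 addresses:   " + addressesOf("eth1"));
    }
}
```

If the two lines differ, the receiver would bind to a different interface under "auto" than the one carrying the multicast route, which is why an explicit address is set here.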

Any other pointers?
Jim



________________________________
From: Jorge Medina <jmed...@e-dialog.com>
To: Tomcat Users List <users@tomcat.apache.org>
Sent: Thursday, April 2, 2009 4:31:06 PM
Subject: RE: tomcat 6 session replication issues

What is your multicast address and port used by Tomcat to discover
members of the cluster?

Your server.xml has a note [10.x.x.x]. This does not look like a
multicast address. 

http://tldp.org/HOWTO/Multicast-HOWTO-2.html
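
A quick way to check whether an address is in the multicast range (224.0.0.0 to 239.255.255.255) is the JDK's own test. This is only a sketch; the class name is made up, 10.99.86.47 is the member address from the log in the original post, and 228.0.0.4 is the default membership address listed in the Tomcat cluster documentation.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of Tomcat.
class MulticastCheck {

    // An IP literal is parsed directly, so no DNS lookup takes place.
    static boolean isMulticast(String ipLiteral) throws UnknownHostException {
        return InetAddress.getByName(ipLiteral).isMulticastAddress();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("10.99.86.47 multicast? " + isMulticast("10.99.86.47")); // false
        System.out.println("228.0.0.4 multicast?   " + isMulticast("228.0.0.4"));   // true
    }
}
```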




________________________________

From: Jimmy Phillips [mailto:jimmy.phillip...@yahoo.com] 
Sent: Thursday, April 02, 2009 11:21 AM
To: users@tomcat.apache.org
Subject: tomcat 6 session replication issues


Hi,

I've been having issues with Tomcat session replication. I have a number
of Tomcat servers running in cluster mode behind an Apache load
balancer. The Tomcat version is 6.0.18 on CentOS 5.1. Running the
cluster with the DeltaManager seems to work fine. However, when I
try to use the BackupManager for session replication, I get the
following entries in the logs:

Apr 1, 2009 3:28:42 AM org.apache.catalina.tribes.transport.nio.NioReceiver socketTimeouts
WARNING: Channel key is registered, but has had no interest ops for the last 3000 ms. (cancelled:false):sun.nio.ch.selectionkeyi...@62af9d74 last access:2009-04-01 03:28:35.969
Apr 1, 2009 3:28:42 AM org.apache.catalina.tribes.transport.nio.NioReceiver socketTimeouts
WARNING: Channel key is registered, but has had no interest ops for the last 3000 ms. (cancelled:false):sun.nio.ch.selectionkeyi...@4c4947d3 last access:2009-04-01 03:28:35.969
Apr 1, 2009 3:29:04 AM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
INFO: Received memberDisappeared[org.apache.catalina.tribes.membership.MemberImpl[tcp://{10, 99, 86, 47}:4000,{10, 99, 86, 47},4000, alive=1380182,id={-121 25 -2 -7 81 -1 76 3 -92 -20 122 69 67 102 -31 -15 }, payload={}, command={}, domain={}, ]] message. Will verify.
Apr 1, 2009 3:29:04 AM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
INFO: Verification complete. Member still alive[org.apache.catalina.tribes.membership.MemberImpl[tcp://{10, 99, 86, 47}:4000,{10, 99, 86, 47},4000, alive=1380182,id={-121 25 -2 -7 81 -1 76 3 -92 -20 122 69 67 102 -31 -15 }, payload={}, command={}, domain={}, ]]
Apr 1, 2009 3:29:04 AM org.apache.catalina.tribes.tipis.AbstractReplicatedMap heartbeat
SEVERE: Unable to send AbstractReplicatedMap.ping message
org.apache.catalina.tribes.ChannelException: Operation has timed out(60000 ms.).; Faulty members:tcp://{10, 99, 86, 47}:4000;
        at org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:97)
        at org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:53)
        at org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:80)
        at org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:78)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
        at org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor.sendMessage(ThroughputInterceptor.java:61)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
        at org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor.sendMessage(MessageDispatchInterceptor.java:73)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
        at org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.sendMessage(TcpFailureDetector.java:87)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
        at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:216)
        at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:175)
        at org.apache.catalina.tribes.group.RpcChannel.send(RpcChannel.java:89)
        at org.apache.catalina.tribes.tipis.AbstractReplicatedMap.ping(AbstractReplicatedMap.java:253)
        at org.apache.catalina.tribes.tipis.AbstractReplicatedMap.heartbeat(AbstractReplicatedMap.java:793)
        at org.apache.catalina.tribes.group.GroupChannel.heartbeat(GroupChannel.java:153)
        at org.apache.catalina.tribes.group.GroupChannel$HeartbeatThread.run(GroupChannel.java:661)
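
For anyone reading the log, the member host is printed as a list of bytes, so {10, 99, 86, 47} is 10.99.86.47. A small sketch for converting such a list back into dotted form follows. It assumes the values are printed as signed Java bytes, so an octet above 127 would show up as a negative number; the class and method names are made up for illustration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for reading Tribes log output, not part of Tomcat.
class MemberHostDecoder {

    // Accepts text such as "{10, 99, 86, 47}" and returns "10.99.86.47".
    static String toDottedQuad(String braced) throws UnknownHostException {
        // Drop braces and whitespace, leaving comma-separated numbers.
        String[] parts = braced.replaceAll("[{}\\s]", "").split(",");
        byte[] raw = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            // Parse as int so both signed (-64) and unsigned (192) forms work.
            raw[i] = (byte) Integer.parseInt(parts[i]);
        }
        // getByAddress rejects arrays that are not 4 or 16 bytes long.
        return InetAddress.getByAddress(raw).getHostAddress();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toDottedQuad("{10, 99, 86, 47}")); // 10.99.86.47
    }
}
```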

Of course the above entry is just one of many, for the different hosts.
Searching the mailing lists, I found this post
http://markmail.org/message/jv4dykh7fdhr4mvp which looks like the same
problem I am having. The outcome of that thread states that the problem
is fixed by a patch in revision 618823, so I compiled a version of the
current 6.x trunk (rev 759722) and deployed it to all the servers.
However, the problem still appears. I've attached a copy of the
current server.xml (it is common to all Tomcat instances).

I took a thread dump on one of the servers when these errors started
appearing, and the output is attached as thread_dump.txt (I removed the
threads that were run by our application).

This problem is reproducible each time I restart the servers. At this
stage, I have no idea what to try next, so I'm looking forward to your
replies.


Regards,
Jim.

Attached: server.xml, thread_dump.txt


      
