Re: Help with Tomcat 7 clustering using BIO receiver

Filip Hanik Thu, 03 Jul 2014 12:12:28 -0700

Are your machines in time sync? If they are not, a session can get
timed out.

SEVERE: Manager [localhost#/myApp]: Unable to receive message through TCP channel
java.lang.IllegalStateException: setAttribute: Session [DEC3612CF763194E7953DB3FD2C433E0] has already been invalidated
        at org.apache.catalina.session.StandardSession.setAttribute(StandardSession.java:1437)
        at org.apache.catalina.ha.session.DeltaSession.setAttribute(DeltaSession.java:695)
        at org.apache.catalina.ha.session.DeltaRequest.execute(DeltaRequest.java:168)
        at org.apache.catalina.ha.session.DeltaManager.handleSESSION_DELTA(DeltaManager.java:1337)
        at org.apache.catalina.ha.session.DeltaManager.messageReceived(DeltaManager.java:1283)
        at org.apache.catalina.ha.session.DeltaManager.messageDataReceived(DeltaManager.java:1001)
        at org.apache.catalina.ha.session.ClusterSessionListener.messageReceived(ClusterSessionListener.java:91)
        at org.apache.catalina.ha.tcp.SimpleTcpCluster.messageReceived(SimpleTcpCluster.java:943)
        at org.apache.catalina.ha.tcp.SimpleTcpCluster.messageReceived(SimpleTcpCluster.java:924)
        at org.apache.catalina.tribes.group.GroupChannel.messageReceived(GroupChannel.java:278)
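
To illustrate the time-sync point, here is a minimal sketch of how clock skew
alone can make a freshly used session look expired on another node. It is a
simplified model of an idle-timeout check, not Tomcat's actual code; the class
name, the method name, the 30-minute timeout and the 31-minute skew are all
assumptions chosen for illustration.

```java
import java.util.concurrent.TimeUnit;

public class ClockSkewSketch {

    /**
     * Simplified idle-timeout check (hypothetical, not Tomcat's implementation):
     * a session looks expired when the time since its last access, measured
     * against the LOCAL clock, reaches the inactivity limit.
     */
    public static boolean looksExpired(long lastAccessedMillis,
                                       long localNowMillis,
                                       int maxInactiveSeconds) {
        long idleSeconds =
                TimeUnit.MILLISECONDS.toSeconds(localNowMillis - lastAccessedMillis);
        return idleSeconds >= maxInactiveSeconds;
    }

    public static void main(String[] args) {
        int maxInactiveSeconds = 1800;           // assumed 30-minute timeout
        long node1Now = 1_000_000_000L;          // arbitrary fixed timestamp
        long lastAccessed = node1Now;            // session just used on node1

        // Assumption: node2's clock runs 31 minutes ahead of node1's.
        long node2Now = node1Now + TimeUnit.MINUTES.toMillis(31);

        System.out.println("node1 sees expired: "
                + looksExpired(lastAccessed, node1Now, maxInactiveSeconds));
        System.out.println("node2 sees expired: "
                + looksExpired(lastAccessed, node2Now, maxInactiveSeconds));
    }
}
```

In this model node1 treats the session as live while node2 treats it as
expired, so a delta arriving on node2 would be applied to a session that has
already been invalidated there, which is the shape of the exception above.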





On Thu, Jul 3, 2014 at 1:07 PM, João Sávio <joaosa...@gmail.com> wrote:

> Hi everyone
>
> I ran my test (1k requests in total, 100 threads) against two
> nodes with default VM settings; I only set the heap size. I got about 15%
> errors.
>
> cluster.log - node1 - http://pastebin.com/cpX900Qw
> cluster.log - node2 - http://pastebin.com/qCSzMaU6
>
> Running for longer (500k requests in total, 100 threads), I
> got about 11% errors. The statistics for this run:
>
> Jul 03, 2014 5:53:28 PM
> org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor report
> INFO: ThroughputInterceptor Report[
> Tx Msg:10000 messages
> Sent:12.36 MB (total)
>  Sent:12.36 MB (application)
> Time:7.82 seconds
> Tx Speed:1.58 MB/sec (total)
>  TxSpeed:1.58 MB/sec (application)
> Error Msg:0
> Rx Msg:10198 messages
>  Rx Speed:0.08 MB/sec (since 1st msg)
> Received:12.38 MB]
>
>
> All session attributes are Serializable, and it's a session replication
> issue because when I run my test with just one node, I get 0% errors.
>
> Regarding "on time", just a correction:
>
> 1. first request, pick a random server and store a session object
> 2. second request, pick *ANY* server (chosen by the LB based on the cookie -
> it can be the same, but not necessarily) and ask for the session object.
>
> To be more clear, I've been working with a conference system. Each
> conference should take place on one node. So, the first request can hit any
> server, and from the second request on, requests should hit the node where
> the conference is.
>
>
> Thanks a lot
> João
>
>
> 2014-07-03 15:40 GMT-03:00 Filip Hanik <fi...@hanik.com>:
>
> > you mention NIO and say maxThreads, that sounds like the <Connector>
> > configuration, but the BIO receiver is on the cluster, and it is a
> > completely different component that also has an applicable NIO
> > configuration.
> >
> > are you confusing the two?
> > I'm saying that you should use the NIO receiver on the cluster component,
> > and if you do, what kind of errors do you get?
> >
> >
> > On Thu, Jul 3, 2014 at 12:19 PM, Mark Eggers <its_toas...@yahoo.com.invalid>
> > wrote:
> >
> > > -----BEGIN PGP SIGNED MESSAGE-----
> > > Hash: SHA1
> > >
> > > João,
> > >
> > > This list has a convention of posting either inline or at the end of
> > > the message you're replying to.
> > >
> > > See here for mailing list notes:
> > >
> > > http://tomcat.apache.org/lists.html#tomcat-users
> > >
> > > On 7/3/2014 10:24 AM, João Sávio wrote:
> > > > Hello
> > > >
> > > > Some points below:
> > > >
> > > > ** What is "on time"?* In my application, a group of users should
> > > > always hit the same node after the first request. So, in first
> > > > request each group of users will receive a specific cookie, and
> > > > LB will perform the load balancing based on this cookie. In first
> > > > request, a user can hit any node, but from the second, he or she
> > > > should hit the same node.
> > >
> > > Hmm, so 'on time' really means that subsequent requests should hit the
> > > same server.
> > >
> > > If you're using sessions, Tomcat has an attribute on the Engine
> > > element called jvmRoute. So depending on your load balancer (and if
> > > you use AJP), you can use Tomcat and AJP to route traffic. In that
> > > case, there's no need to write a special cookie.
> > >
> > > At any rate, this doesn't sound like a clustering error per se.
> > >
> > > >
> > > > ** What are the errors? Test result errors?* For this test, I
> > > > simplified the code of my application:
> > > > - first request: store one object in session
> > > > - second request: verify if the object is in session. If it's
> > > >   not -> ERROR
> > > >
> > >
> > > So looking at the information from 'on time', the scenario should be:
> > >
> > > 1. first request, pick a random server and store a session object
> > > 2. second request, pick the SAME server and ask for the session object
> > >
> > > Again, I'm not seeing where this is a clustering issue per se.
> > >
> > > > ** How big are the sessions that you're trying to replicate?*
> > > > - I'm using Spring MVC, and I have 3 additional objects in
> > > > session. They are not big (15 attributes each one)
> > > >
> > >
> > > And all attributes are serializable? The objects are also marked as
> > > serializable?
> > >
> > > > ** What's the load like on the box when you're running the tests
> > > > that you get errors on?* - I've been experiencing this issue on BIO
> > > > even without load
> > > >
> > >
> > > I may not have phrased my question carefully. What is the CPU and
> > > memory situation on your test box while running the 4 Tomcat servers?
> > >
> > > I know you've trimmed down your Xms and Xmx (presumably to fit in your
> > > test environment), but in combination with your other JVM parameters
> > > could this be causing some issues?
> > >
> > > I would follow Dan's recommendation of maybe just setting Xms, Xmx, GC
> > > logging to see what happens. Ah, I see you're going to do that below.
> > >
> > > > ** It is preferred to use the non blocking receiver to be able to
> > > > grow your cluster without running into thread starvation.* -
> > > > That's why I've tried NIO first, but I'd like to see if BIO solves
> > > > my issue and if using BIO my system doesn't get too slow.
> > >
> > > I don't think speed is so much an issue here, but scalability is. NIO
> > > can handle multiple requests per thread, BIO cannot.
> > >
> > > >
> > > >
> > > > Now, I'll try to run my tests using NIO, default VM configuration
> > > > and FINER logs.
> > >
> > > Post the results when you get them. If the logs are relatively small,
> > > just cut and paste into the mail message.
> > >
> > > I suspect FINER is going to generate LOTS of logging and slow down
> > > your application.
> > >
> > > >
> > > > Thanks a lot João
> > >
> > > . . . . just my two cents
> > > /mde/
> > >
> > > >
> > > >
> > > > 2014-07-03 14:07 GMT-03:00 Mark Eggers
> > > > <its_toas...@yahoo.com.invalid>:
> > > >
> > > > On 7/3/2014 9:12 AM, João Sávio wrote:
> > > >>>> cluster.log -> http://pastebin.com/c98WhnmG
> > > >>>>
> > > >>>>
> > > >>>> 2014-07-03 13:04 GMT-03:00 João Sávio <joaosa...@gmail.com>:
> > > >>>>
> > > >>>>> Hello!
> > > >>>>>
> > > > >>>>> Using NIO (with channelSendOptions="4", i.e.,
> > > > >>>>> synchronous), under light load, my tests pass 100%. But,
> > > > >>>>> under heavy load, not all sessions are replicated on time, and
> > > > >>>>> I have about 20% of errors. If I increase maxThreads to
> > > > >>>>> 400, I have about 15% of errors.
> > > >>>>>
> > > > >>>>> More information:
> > > > >>>>> * I am not performing parallel requests with same session
> > > > >>>>> * my cluster has 4 nodes (all in one machine - for test
> > > > >>>>>   purpose only)
> > > > >>>>> * Java 7 64 bits, Tomcat 7.0.52, Windows 7 64 bits
> > > > >>>>> * using default NIO configuration, but with maxThreads=400
> > > > >>>>> * VM options:
> > > >>>>> -Xms512M   - on real environment this value is 1024
> > > >>>>> -Xmx512M  - on real environment this value is 1024
> > > >>>>> -XX:NewSize=450M -XX:MaxNewSize=450M -XX:PermSize=128M
> > > >>>>> -XX:MaxPermSize=245M -XX:SurvivorRatio=8
> > > >>>>> -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=15
> > > >>>>> -XX:+UseBiasedLocking -XX:CMSInitiatingOccupancyFraction=60
> > > >>>>>  -XX:+UseCMSInitiatingOccupancyOnly
> > > >>>>> -XX:+CMSClassUnloadingEnabled -XX:+UseConcMarkSweepGC
> > > >>>>> -XX:+CMSIncrementalMode -XX:+UseParNewGC
> > > >>>>> -XX:+DisableExplicitGC -XX:+PrintGCDateStamps
> > > >>>>> -XX:+PrintGCDetails
> > > >>>>> -Xloggc:%CATALINA_BASE%/logs/tomcat-gc.log
> > > >>>>>
> > > >>>>> Moreover, I'm trying to attach the logs again.
> > > >>>>>
> > > >>>>> Thanks João
> > > >
> > > > João,
> > > >
> > > > I took a look at the log. This is the BIO attempt and you do run
> > > > out of threads. See the following:
> > > >
> > > > Jul 03, 2014 11:41:21 AM
> > > > org.apache.catalina.tribes.transport.bio.BioReceiver listen
> > > > WARNING: All BIO server replication threads are busy, unable to
> > > > handle more requests until a thread is freed up.
> > > >
> > > > What's the load like on the box when you're running the tests that
> > > > you get errors on?
> > > >
> > > > As Dan asks in his message:
> > > >
> > > > What is "on time"? What are the errors? Test result errors?
> > > >
> > > > How big are the sessions that you're trying to replicate?
> > > >
> > > > My guess is that something else is going on, since the following
> > > > log entry doesn't show much in the way of cluster traffic.
> > > >
> > > > INFO: ThroughputInterceptor Report[
> > > > Tx Msg:1 messages
> > > > Sent:0.00 MB (total)
> > > > Sent:0.00 MB (application)
> > > > Time:0.01 seconds
> > > > Tx Speed:0.04 MB/sec (total)
> > > > TxSpeed:0.04 MB/sec (application)
> > > > Error Msg:0
> > > > Rx Msg:13 messages
> > > > Rx Speed:0.00 MB/sec (since 1st msg)
> > > > Received:0.00 MB]
> > > >
> > > > It would also be interesting to see the logs when you use the NIO
> > > > connector. According to the documentation:
> > > >
> > > > It is preferred to use the non blocking receiver to be able to grow
> > > > your cluster without running into thread starvation.
> > > >
> > > > Also from the documentation:
> > > >
> > > > Usually the rule is to use 1 thread per node in the cluster for
> > > > small clusters, and then depending on your message frequency and
> > > > your hardware, you'll find an optimal number of threads peak out
> > > > at a certain number.
> > > >
> > > > We might need a little more background on your application and your
> > > > test environment to figure out why clustering is not behaving for
> > > > you.
> > > >
> > > > . . . just my two cents /mde/
> > > >
> > > > PS - you have some errors in your server.xml (see the log). While
> > > > they won't impact this problem, it might be a good idea to address
> > > > them.
> > > >
> > > > /mde/
> > > >>
> > > >>
> ---------------------------------------------------------------------
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>
> > > To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> > > >> For additional commands, e-mail: users-h...@tomcat.apache.org
> > > >>
> > > >>
> > > >
> > > >
> > >
> > > -----BEGIN PGP SIGNATURE-----
> > > Version: GnuPG v1.4.13 (MingW32)
> > > Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
> > >
> > > iQEcBAEBAgAGBQJTtZ6rAAoJEEFGbsYNeTwtAPsH/jqUHnP5Wag9fRLUYQD582/O
> > > 7FRoBv+Iq0/VWs2o9Wv0VrAOOazUhtAG38JgCX+v70u2MJNOIcVVpXCuOjZeSJYB
> > > WRkNIXqRCrVDc3/ZX3nTQoXJheZEfrdvB5cikoARPmBJeb4kOpnxKSs97OSJjHYU
> > > uCCoXocVfDM3JxtEHXNHyy6BuIYdizvH7DwGSts7shggT/LmKmxA16AzChppwSr4
> > > 87p7jCJyxxPJ9MeRNP4uDQpV+Z/1DDhMxzUc9P8VJuSykJ1YUdQOm24AuGezsYyx
> > > ZQrLkioRnxDcOwpKSoI1o0r/2NgS97YR4GZbU6npzD1DjvPjZm4zimbNKM+l0iE=
> > > =ftU9
> > > -----END PGP SIGNATURE-----
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> > > For additional commands, e-mail: users-h...@tomcat.apache.org
> > >
> > >
> >
>
>
>
> --
> http://joaosavio.wordpress.com
>
