Rainer,

Thanks very much for the clarification! Since I have been playing with the load balancing strategy set to session ("worker.router.method=S" on my load balancer), is there a way to tell roughly how many sessions have been pinned to each worker/Tomcat? In this case, would the load balancer value be (something like) the number of new sessions sent to a particular worker, divided by two some number of times? If so, you still would not know the number of sessions pinned to a worker, because the factors of two have been divided out.

I just got an HTTP JMX adapter wired up in Tomcat, so I'll see if I can get session info that way...
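To make the question concrete, here is a rough Python sketch. It assumes that with method=S the value goes up by one per new session and is halved once a minute, the way Rainer describes for requests. The function and variable names are made up for illustration; this is not mod_jk code.

```python
def decayed_value(new_sessions_per_minute):
    """Replay per-minute new-session counts; return (V, total sessions)."""
    value = 0.0
    total = 0
    for count in new_sessions_per_minute:
        value += count   # assumption: each new session adds 1 to V
        total += count
        value /= 2       # assumption: V is halved once per minute
    return value, total


# Same total number of sessions, two different arrival patterns.
steady = decayed_value([10] * 10)        # 10 new sessions every minute
burst = decayed_value([0] * 9 + [100])   # 100 new sessions in the last minute

print(steady)  # (9.990234375, 100)
print(burst)   # (50.0, 100)
```

Both workers in this sketch received 100 sessions, yet their V values differ by about a factor of five, so under these assumptions V alone cannot be turned back into a session count.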
Thanks again,
Brian

-->-----Original Message-----
-->From: Rainer Jung [mailto:[EMAIL PROTECTED]
-->Sent: Thursday, August 23, 2007 11:22 AM
-->To: Tomcat Users List
-->Subject: Re: JK Loadbalancer not balancing fairly
-->
-->[EMAIL PROTECTED] wrote:
-->> Ben,
-->>
-->> So I assume you have two web servers fronting two app servers - or
-->> there are two servers both of which have a web server and an app
-->> server? For the restart you talk about - did you restart both web
-->> servers? Do you have a good load balancer (local director, content
-->> director like an F5) in front of the two web servers?
-->>
-->> If I am reading your JKStatus text correctly I noticed the
-->> following:
-->>
-->> Load balancer value on web server 2
-->> ----------------------------------- = ~0.56
-->> Load balancer value on web server 1
-->>
-->> but
-->>
-->> Number requests on web server 2
-->> ----------------------------------- = ~0.91
-->> Number requests on web server 1
-->>
-->> Now, if I am interpreting the meaning of "load balancer value" and
-->> "number of requests" correctly, that would imply that the number of
-->> sessions stuck to each app server from web server 1 is very roughly
-->> twice as high as from 2, but the total number of requests sent to
-->> each app server from both web servers is very roughly the same.
-->> (Can someone confirm I'm interpreting those #s correctly?)
-->
-->The number of requests is the total since the last jk/Apache
-->restart. So if the last restart was shortly before, the numbers will
-->not help. If they were not reset after the tests, we would know that
-->Apache 1 had slightly more requests than Apache 2, but both of them
-->sent exactly the same number of requests to the two Tomcat nodes
-->(delta=1 request).
-->
-->The V column is the balancing value used to decide where the next
-->request goes to.
-->It is the number of requests sent to the Tomcat, divided by two
-->once a minute, so it is multiplied by a decay curve. The big
-->difference between the V values of Apache 1 and Apache 2 does not
-->matter. It could simply mean that the one with the bigger V value
-->did its division more recently. The V values for the two Tomcats
-->are again very similar on the same Apache, another indication of
-->good balancing.
-->
-->All this is true for the default balancing method "Requests".
-->
-->I would suggest first to follow CPU by Tomcat process over the test
-->period (not per system and not simply as one number, instead as a
-->graph over time).
-->
-->> According to the docs, each connector by default tries to keep the
-->> number of requests sent to each worker the same, which looks to be
-->> happening reasonably well. (I'm playing with trying to keep the
-->> number of sessions balanced since our apps tend to be more of a
-->> memory issue than a CPU issue. There is a setting on the connector
-->> for this.)
-->>
-->> With some info on your setup we can try to figure out the load
-->> imbalance.
-->>
-->> As a note, I am playing with the jk1.2.x connector, but our
-->> production systems use the old jk2.x connector. With that, I've
-->> seen a load imbalance on the app servers when one of the app
-->> servers has gone down for a while, and then has come back up. If
-->> the connectors are not reset, they will try to "catch up" the
-->> restarted app server in terms of the number of requests it has
-->> handled, thus loading it more heavily than servers that have been
-->> up the whole time.
-->
-->The catchup problem should be fixed. A recovered or reactivated
-->worker gets the biggest "work done" value of all other workers, so
-->it should start normal or even a little less loaded.
-->
-->>
-->> Brian
-->
-->Regards,
-->
-->Rainer
-->
-->---------------------------------------------------------------------
-->To start a new topic, e-mail: users@tomcat.apache.org
-->To unsubscribe, e-mail: [EMAIL PROTECTED]
-->For additional commands, e-mail: [EMAIL PROTECTED]