It didn't paste as well as I'd hoped. The relevant information, formatted as plain text, is:

Name       F      Acc     Wr     Rd  Busy   Max
webl7    100   495464   239M   2.4G     1    41
webl4    130   648407   312M   3.1G     1    50
webl6    169  1056555   507M   1.1G     7    56  <---- now getting ALL requests
webl8    220  4163167   2.0G   3.8G    28   109
webl5     25   124571    60M   606M     1    19

The column headings come from the status tool:
Name    Worker route name
F       Load balancer factor
Acc     Number of requests
Wr      Number of bytes transferred
Rd      Number of bytes read
Busy    Current number of busy connections
Max     Maximum number of busy connections

Hoping this time it's readable...

On Jun 9, 2005, at 8:13 PM, Tom Anderson wrote:

I hope that the "snapshot" of my jkstatus below shows up okay. This is from my current setup, which uses mod_jk 1.2.13 with the method=Traffic setting.

What's not obvious from this static snapshot is that the middle webserver (webl6) is currently getting all requests. This is in spite of the fact that the 4th server (webl8) shows 28 busy connections. Don't believe that number; it's not getting any connections. Of the 5 webservers, only the middle one is getting requests.

I don't know how that happened, but it appears that the Rd (bytes read) counter got reset, so in order to "balance" things out it is now sending everything to the one server. At first I thought maybe it was because transferred, readed (sic) and mytraffic are size_t and one of them rolled over. But a 32-bit size_t would roll over at 4GB, right? Since I'm not fluent in this code, maybe someone who is could comment.
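
The rollover idea can be sanity-checked with a bit of arithmetic. The sketch below is only an illustration: it assumes size_t is 32 bits wide (on a 64-bit build nothing would wrap at this scale), and the function name and traffic figures are made up rather than taken from mod_jk.

```python
# Sketch only: simulates an unsigned counter, ASSUMING size_t is 32 bits.
SIZE_T_BITS = 32                 # assumption: 32-bit build
SIZE_T_MOD = 1 << SIZE_T_BITS    # 4294967296 bytes = 4 GiB

GIB = 1 << 30
MIB = 1 << 20


def add_wrapping(counter, nbytes):
    """Add nbytes to counter the way an unsigned 32-bit integer would."""
    return (counter + nbytes) % SIZE_T_MOD


rd = 0
# Hypothetical traffic: 5 GiB read in 1 MiB chunks.
for _ in range(5 * 1024):
    rd = add_wrapping(rd, MIB)

print(rd / GIB)  # 1.0
```

A worker that has really read about 5G would therefore display about 1G after wrapping. In the snapshot, webl7, webl4 and webl5 all show Rd at roughly ten times Wr, which would put webl6 near 5G rather than 1.1G. That is at least consistent with a wrap, though it is only a guess from a single snapshot.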

From my cursory look, it appears there might be a couple of issues here:

1. Relying on total bytes (requests) can lead to situations where all requests go to a single worker if one of the counters gets messed up. Perhaps it would be more reliable to keep a moving average instead, so that a bad counter would only temporarily disrupt normal operations.

2. Based on the Busy counts being incorrect, there doesn't appear to be any semaphore locking of the shared memory. Could that be why the Rd value got reset?
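
To make point 1 concrete, here is a rough sketch comparing selection by cumulative bytes with selection by a periodically decayed figure (a simple stand-in for a moving average). This is not mod_jk's actual algorithm; every name in it (Worker, decay_all, pick_by_total, pick_by_recent, simulate) is hypothetical, and the byte counts are invented.

```python
# Sketch only: not mod_jk's code. All names and numbers are hypothetical.
class Worker:
    def __init__(self, name, total=0.0):
        self.name = name
        self.total = total   # cumulative bytes since startup
        self.recent = 0.0    # traffic figure that is decayed periodically

    def record(self, nbytes):
        self.total += nbytes
        self.recent += nbytes


def decay_all(workers, factor=0.5):
    """Age out old traffic so that history fades over time."""
    for w in workers:
        w.recent *= factor


def pick_by_total(workers):
    """Choose the worker with the smallest cumulative byte count."""
    return min(workers, key=lambda w: w.total)


def pick_by_recent(workers):
    """Choose the worker with the smallest decayed byte count."""
    return min(workers, key=lambda w: w.recent)


def simulate(pick, rounds=1000, interval=10):
    """Send `rounds` equal-sized requests; return hits per worker."""
    workers = [
        Worker("a", total=4e9),
        Worker("b", total=4e9),
        Worker("c", total=0.0),  # pretend c's counter was reset
    ]
    hits = {w.name: 0 for w in workers}
    for i in range(rounds):
        w = pick(workers)
        w.record(1_000_000)
        hits[w.name] += 1
        if (i + 1) % interval == 0:
            decay_all(workers)
    return hits


print(simulate(pick_by_total))
print(simulate(pick_by_recent))
```

With the cumulative totals, worker c receives every one of the 1000 requests because its counter stays far below the others. With the decayed figure, the requests are spread roughly evenly, since old history stops mattering after a few decay intervals.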

~Tom

