Hi Luca,

Thanks for the details.
1. Our server's ulimit values are:
$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 63714
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Please let me know whether the values are sufficient to allow at least 500
concurrent connections.
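
A rough sanity check of the process limit can be sketched as below. It assumes prefork (one httpd process per connection) and that httpd runs under the same limits this shell reports, which may not hold for a service started at boot. The helper names and the `other_processes` parameter are hypothetical, used only for illustration.

```python
import re

# Excerpt of the `ulimit -a` output quoted above.
ULIMIT_OUTPUT = """\
open files                      (-n) 1024
max user processes              (-u) 1024
"""

def parse_ulimit(text):
    """Map each ulimit flag (e.g. 'u') to an int, or None for 'unlimited'."""
    limits = {}
    for line in text.splitlines():
        # Each line ends with "-<flag>) <value>", e.g. "(-u) 1024".
        m = re.search(r"-(\w)\)\s+(\S+)\s*$", line)
        if m:
            flag, value = m.groups()
            limits[flag] = None if value == "unlimited" else int(value)
    return limits

def process_headroom(limits, wanted_children, other_processes=0):
    """Processes left under the -u limit once wanted_children exist.

    Returns None when the limit is unlimited.
    """
    cap = limits.get("u")
    if cap is None:
        return None
    return cap - wanted_children - other_processes

limits = parse_ulimit(ULIMIT_OUTPUT)
print(process_headroom(limits, 500))  # 524
print(process_headroom(limits, 999))  # 25
```

With 999 children, only 25 process slots remain under a limit of 1024, which is consistent with the setuid errors appearing right after the error_log reports 999 total children.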

2. Yes, I checked the mod_jk log when the hang happens, and I am getting the
errors below continuously.

[Wed Apr 19 02:00:38 2017]loadbalancer www.cmsp1.com 24.843284
[Wed Apr 19 02:00:38 2017][16313:3878614784] [info]
ajp_process_callback::jk_ajp_common.c (1788): Writing to client aborted or
client network problems
[Wed Apr 19 02:00:38 2017][16313:3878614784] [info]
ajp_service::jk_ajp_common.c (2447): (qu_prod_live_svr1) sending request to
tomcat failed (unrecoverable), because of client write error (attempt=1)
[Wed Apr 19 02:00:38 2017][16313:3878614784] [info] service::jk_lb_worker.c
(1384): service failed, worker qu_prod_live_svr1 is in local error state
[Wed Apr 19 02:00:38 2017][16313:3878614784] [info] service::jk_lb_worker.c
(1403): unrecoverable error 200, request failed. Client failed in the
middle of request, we can't recover to another instance.
[Wed Apr 19 02:00:38 2017]loadbalancer www.cmsp1.com 19.170901
[Wed Apr 19 02:00:38 2017][16313:3878614784] [info] jk_handler::mod_jk.c
(2608): Aborting connection for worker=loadbalancer
[Wed Apr 19 02:00:39 2017][16261:3878614784] [warn]
map_uri_to_worker_ext::jk_uri_worker_map.c (962): Uri * is invalid. Uri
must start with /
[Wed Apr 19 02:00:40 2017][16308:3878614784] [warn]
map_uri_to_worker_ext::jk_uri_worker_map.c (962): Uri * is invalid. Uri
must start with /

3. We will upgrade to 2.4.25. Could you please share a recommended
mpm-event configuration that allows more concurrent users?
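
For context, with mpm-event the number of request-processing threads is bounded by ServerLimit multiplied by ThreadsPerChild. A minimal sketch of that arithmetic follows, assuming httpd's default ThreadsPerChild of 25 and treating the 500-connection target as simultaneous in-flight requests. The function name is hypothetical, and the result is a starting point for sizing, not a tuned configuration.

```python
import math

def event_mpm_sizing(target_connections, threads_per_child=25):
    """Return (ServerLimit, MaxRequestWorkers) covering target_connections.

    Assumes the rule that MaxRequestWorkers cannot exceed
    ServerLimit * ThreadsPerChild; 25 is httpd's default ThreadsPerChild.
    """
    # Round up so that a partial child's worth of threads still gets a process.
    server_limit = math.ceil(target_connections / threads_per_child)
    return server_limit, server_limit * threads_per_child

print(event_mpm_sizing(500))  # (20, 500)
```

Under these assumptions, 500 worker threads need about 20 child processes, compared with 500 processes under prefork.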

Thanks
Jay


On Tue, Apr 18, 2017 at 10:03 AM, Luca Toscano <toscano.l...@gmail.com>
wrote:

> Hi,
>
> Some suggestions:
>
> 1) Check the RHEL ulimits applied to httpd. The error message "Resource
> temporarily unavailable: setuid: unable to change to uid" could mean that
> the maximum number of processes allowed by the OS has been reached. Raising
> that limit should allow you to spawn more httpd processes.
>
> 2) Have you checked when the "hang" happens? If you have long lived
> connections and your httpd server reloads (for example for log rotation)
> then it might hang a bit while waiting for the remaining connections to
> drain.
>
> 3) If possible, I'd consider upgrading httpd to >= 2.4.25 and using
> mpm-event (rather than prefork).
>
> Hope that helps!
>
> Luca
>
>
> 2017-04-16 13:18 GMT+02:00 Jayaram Ponnusamy <jayaram.ponnus...@gmail.com>
> :
>
>> Dear All,
>>
>> We were running our site on a PHP-based CMS earlier, and normally
>> 20-30K users access our sites daily. With the new Tomcat-based system we
>> are frequently facing performance and availability issues. When I access
>> the Tomcat URL directly, the page loads within 3 seconds, but through the
>> web server URL it takes more than 9 seconds.
>>
>> Also, each day I am seeing more and more of these in my error_log, and
>> when the total children value reaches 999, Apache stops responding and
>> only a server reboot brings the site back. We face this issue at least
>> 4-5 times every day (we are using mod_jk to connect to Tomcat).
>>
>> Kindly help with this.
>>
>> Usually I am seeing this on my error_log:
>> [Sat Apr 15 20:49:33 2017] [info] server seems busy, (you may need to
>> increase StartServers, or Min/MaxSpareServers), spawning 8 children, there
>> are 4 idle, and 31 total children
>> [Sat Apr 15 20:51:14 2017] [info] server seems busy, (you may need to
>> increase StartServers, or Min/MaxSpareServers), spawning 8 children, there
>> are 0 idle, and 20 total children
>> [Sat Apr 15 20:51:15 2017] [info] server seems busy, (you may need to
>> increase StartServers, or Min/MaxSpareServers), spawning 16 children, there
>> are 0 idle, and 28 total children
>> [Sat Apr 15 20:51:16 2017] [info] server seems busy, (you may need to
>> increase StartServers, or Min/MaxSpareServers), spawning 32 children, there
>> are 0 idle, and 44 total children
>>
>> We are using two Apache nodes connected to two Tomcat instances (with
>> application-level clustering).
>> Apache Servers:
>> 4-core 64-bit RHEL systems with 16 GB RAM (both servers)
>> Server version: Apache/2.2.21 (Unix)
>>
>> *httpd.conf*
>> KeepAlive On
>> Timeout 300
>> MaxKeepAliveRequests 100
>> KeepAliveTimeout 15
>> <IfModule prefork.c>
>> StartServers         80
>> ServerLimit 3500
>> MaxClients 3500
>> MaxRequestsPerChild  0
>> </IfModule>
>>
>> *workers.properties*
>> worker.list=loadbalancer,status
>> worker.qu_prod_live_svr.type=ajp13
>> worker.qu_prod_live_svr.host=cmsp1
>> worker.qu_prod_live_svr.port=8009
>> worker.qu_prod_live_svr.socket_keepalive=1
>> worker.qu_prod_live_svr.socket_timeout=300
>> worker.qu_prod_live_svr1.type=ajp13
>> worker.qu_prod_live_svr1.host=cmsp2
>> worker.qu_prod_live_svr1.port=8009
>> worker.qu_prod_live_svr1.socket_keepalive=1
>> worker.qu_prod_live_svr1.socket_timeout=300
>> worker.qu_prod_live_svr.lbfactor=1
>> worker.qu_prod_live_svr1.lbfactor=1
>> worker.loadbalancer.type=lb
>> worker.loadbalancer.balance_workers=qu_prod_live_svr,qu_prod_live_svr1
>> worker.status.type=status
>>
>> *Tomcat Servers:*
>> 4-core 64-bit RHEL systems with 16 GB RAM (both servers)
>> Server version: Apache Tomcat/7.0.42
>> <Connector port="9090" protocol="HTTP/1.1" redirectPort="8443"
>> URIEncoding="UTF-8" emptySessionPath="true" maxThreads="500"
>> minSpareThreads="10" connectionTimeout="-1" />
>> <Connector port="8009" protocol="AJP/1.3" redirectPort="8443"
>> URIEncoding="UTF-8" />
>>
>> *error_log:*
>> [Sat Apr 15 21:52:36 2017] [info] server seems busy, (you may need to
>> increase StartServers, or Min/MaxSpareServers), spawning 32 children, there
>> are 0 idle, and 839 total children
>> [Sat Apr 15 21:52:37 2017] [info] server seems busy, (you may need to
>> increase StartServers, or Min/MaxSpareServers), spawning 32 children, there
>> are 0 idle, and 871 total children
>> [Sat Apr 15 21:52:38 2017] [info] server seems busy, (you may need to
>> increase StartServers, or Min/MaxSpareServers), spawning 32 children, there
>> are 0 idle, and 903 total children
>> [Sat Apr 15 21:52:39 2017] [info] server seems busy, (you may need to
>> increase StartServers, or Min/MaxSpareServers), spawning 32 children, there
>> are 0 idle, and 935 total children
>> [Sat Apr 15 21:52:40 2017] [info] server seems busy, (you may need to
>> increase StartServers, or Min/MaxSpareServers), spawning 32 children, there
>> are 0 idle, and 967 total children
>> [Sat Apr 15 21:52:41 2017] [info] server seems busy, (you may need to
>> increase StartServers, or Min/MaxSpareServers), spawning 32 children, there
>> are 0 idle, and 999 total children
>> [Sat Apr 15 21:52:41 2017] [alert] (11)Resource temporarily unavailable:
>> setuid: unable to change to uid: 2
>> [Sat Apr 15 21:52:41 2017] [alert] (11)Resource temporarily unavailable:
>> setuid: unable to change to uid: 2
>> [Sat Apr 15 21:52:41 2017] [alert] (11)Resource temporarily unavailable:
>> setuid: unable to change to uid: 2
>> [Sat Apr 15 21:52:41 2017] [alert] (11)Resource temporarily unavailable:
>> setuid: unable to change to uid: 2
>> [Sat Apr 15 21:52:41 2017] [alert] Child 9351 returned a Fatal error...
>> Apache is exiting!
>> [Sat Apr 15 21:52:41 2017] [alert] (11)Resource temporarily unavailable:
>> setuid: unable to change to uid: 2
>> [Sat Apr 15 21:52:41 2017] [alert] (11)Resource temporarily unavailable:
>> setuid: unable to change to uid: 2
>> [Sat Apr 15 21:52:41 2017] [alert] (11)Resource temporarily unavailable:
>> setuid: unable to change to uid: 2
>> [Sat Apr 15 21:53:06 2017] [error] (22)Invalid argument:
>> apr_global_mutex_lock(jk_log_lock) failed
>> [Sat Apr 15 21:53:06 2017] [error] mod_jk: jk_log_to_file
>> [Sat Apr 15 21:53:06 2017][8752:4177577728] [info]
>> ajp_connection_tcp_get_message::jk_ajp_common.c (1150):
>> (qu_prod_live_svr1) can't receive the response header message from tomcat,
>> network problems or tomcat (10.11.11.32:8009) is down (errno=104)\n
>> failed: Broken pipe
>> [Sat Apr 15 21:53:06 2017] [error] (22)Invalid argument:
>> apr_global_mutex_unlock(jk_log_lock) failed
>> [Sat Apr 15 21:53:06 2017] [error] (22)Invalid argument:
>> apr_global_mutex_lock(jk_log_lock) failed
>> [Sat Apr 15 21:53:06 2017] [error] mod_jk: jk_log_to_file [Sat Apr 15
>> 21:53:06 2017][8752:4177577728] [error] ajp_get_reply::jk_ajp_common.c
>> (1962): (qu_prod_live_svr1) Tomcat is down or refused connection. No
>> response has been sent to the client (yet)\n failed: Broken pipe
>> [Sat Apr 15 21:53:06 2017] [error] (22)Invalid argument:
>> apr_global_mutex_unlock(jk_log_lock) failed
>>
>>
>> *Thanks & Regards,*
>> *Jay*
>>
>
>
